perm filename ISSUES.1[COM,LSP] blob
sn#874729 filedate 1989-06-23 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00042 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00007 00002 ∂13-Jun-89 1553 CL-Cleanup-mailer Issue: ADJUST-ARRAY-NOT-ADJUSTABLE (version 10)
C00026 00003 ∂11-Jun-89 1226 CL-Cleanup-mailer issue DYNAMIC-EXTENT-FUNCTION, version 2
C00035 00004 ∂13-Nov-88 1534 CL-Cleanup-mailer Issue: FORMAT-ROUNDING (Version 1)
C00038 00005 ∂08-Oct-88 1751 X3J13-mailer DRAFT Issue: HASH-TABLE-PRINTED-PREPRESENTATION (Version 2)
C00048 00006 ∂20-Mar-89 1241 CL-Cleanup-mailer Issue: HASH-TABLE-SIZE (version 1)
C00060 00007 ∂06-Feb-89 1354 CL-Cleanup-mailer Issue: IGNORE-VARIABLE (Version 1)
C00070 00008 ∂11-Apr-89 0714 CL-Cleanup-mailer Issue: LOAD-TRUENAME (Version 4)
C00082 00009 ∂15-Jun-89 0921 X3J13-mailer issue MACRO-CACHING, version 3
C00092 00010 ∂22-Mar-89 1054 CL-Cleanup-mailer PRETTY-PRINT-INTERFACE, version 4
C00130 00011 ∂26-Jan-89 1215 CL-Cleanup-mailer Issue: PRINT-CASE-PRINT-ESCAPE-INTERACTION (Version 1)
C00142 00012 ∂25-Mar-89 2231 X3J13-mailer **DRAFT** Issue: READ-CASE-SENSITIVITY (Version 2)
C00159 00013 ∂22-Mar-89 0931 X3J13-mailer Issue: SETF-MULTIPLE-STORE-VARIABLES (Version 2)
C00169 00014 ∂22-Jun-89 1251 X3J13-mailer issue SYNTACTIC-ENVIRONMENT-ACCESS, version 10
C00216 00015 ∂11-Jan-89 2316 X3J13-mailer Issue: THE-AMBIGUITY (Version 2)
C00224 00016 ∂11-Jan-89 2346 X3J13-mailer Issue: UNDEFINED-VARIABLES-AND-FUNCTIONS (Version 1)
C00235 00017 ∂17-Mar-89 2126 CL-Cleanup-mailer New issue: WITH-OPEN-FILE-DOES-NOT-EXIST
C00241 00018 ∂16-Mar-89 1045 X3J13-mailer DRAFT Issue: CONDITION-RESTARTS (Version 1)
C00261 00019 ∂25-Mar-89 2239 X3J13-mailer **DRAFT** Issue: ERROR-CHECKING-IN-NUMBERS-CHAPTER (Version 1)
C00292 00020 ∂23-Mar-89 1504 X3J13-mailer **DRAFT** Issue: PATHNAME-CANONICAL-TYPE (Version 1)
C00306 00021 ∂16-Jun-89 2239 X3J13-mailer Issue: PATHNAME-COMPONENT-CASE (version 5)
C00322 00022 ∂16-Jun-89 2153 X3J13-mailer Issue: PATHNAME-COMPONENT-VALUE (version 3)
C00342 00023 ∂23-Mar-89 1503 X3J13-mailer **DRAFT** Issue: PATHNAME-EXTENSIONS (Version 1)
C00356 00024 ∂21-Jun-89 1507 X3J13-mailer Issue: PATHNAME-LOGICAL (version 3)
C00412 00025 ∂23-Mar-89 2059 X3J13-mailer **DRAFT** Issue: PATHNAME-PRINT-READ (Version 1)
C00420 00026 ∂16-Jun-89 2225 X3J13-mailer Issue: PATHNAME-SUBDIRECTORY-LIST (version 7)
C00444 00027 ∂23-Mar-89 2059 X3J13-mailer **DRAFT** Issue: PATHNAME-SYNTAX-ERROR-TIME (Version 1)
C00460 00028 ∂16-Jun-89 2126 X3J13-mailer Issue: PATHNAME-SYSTEM-TYPE (version 2)
C00467 00029 ∂19-Jun-89 1545 X3J13-mailer Issue: PATHNAME-WILD (version 6)
C00496 00030 ∂19-Jun-89 0911 X3J13-mailer Issue: BIT-ARRAY-FUNCTIONS (version 6)
C00517 00031 ∂19-Jun-89 0847 X3J13-mailer Issue: DATA-IO (version 7)
C00535 00032 ∂19-Jun-89 0851 X3J13-mailer Issue: FLOAT-UNDERFLOW (version 3)
C00547 00033 ∂19-Jun-89 0914 X3J13-mailer Issue: MAP-INTO (version 2)
C00553 00034 ∂23-May-89 1148 CL-Cleanup-mailer Issue: STRING-COERCION (version 2)
C00560 00035 ∂23-Mar-89 1527 X3J13-mailer issue DEFINE-OPTIMIZER, version 6
C00585 00036 ∂12-Dec-88 1434 X3J13-mailer Issue: PROCLAIM-LEXICAL (Version 9)
C00607 00037 ∂21-Jun-89 1507 CL-Compiler-mailer Re: Issue: COMPILER-DIAGNOSTICS
C00628 00038 ∂15-Jun-89 0824 X3J13-mailer issue CLOS-MACRO-COMPILATION, version 4
C00640 00039 ∂15-Jun-89 0918 X3J13-mailer issue COMPILED-FUNCTION-REQUIREMENTS, version 6
C00655 00040 ∂15-Jun-89 0913 X3J13-mailer issue COMPILE-FILE-SYMBOL-HANDLING, version 4
C00673 00041 Forum: Compiler
C00680 00042 Issue: PROCLAIM-ETC-IN-COMPILE-FILE
C00692 ENDMK
C⊗;
∂13-Jun-89 1553 CL-Cleanup-mailer Issue: ADJUST-ARRAY-NOT-ADJUSTABLE (version 10)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 13 Jun 89 15:53:25 PDT
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 610864; 13 Jun 89 18:55:13 EDT
Date: Tue, 13 Jun 89 18:55 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: ADJUST-ARRAY-NOT-ADJUSTABLE (version 10)
To: CL-Cleanup@sail.stanford.edu
Message-ID: <19890613225538.4.MOON@EUPHRATES.SCRC.Symbolics.COM>
Version 5 of this proposal passed with amendments at the January
1989 X3J13 meeting. However, the amendments were found to result
in an inconsistent proposal, and it was also pointed out that some
related problems with simple-arrays were not addressed. Since then
there has been a great deal of private discussion, and review of
various versions of the proposal including ones earlier than 5.
The result is this proposal, which is believed to be acceptable to
everyone and is being offered for a vote in June to replace the
January version that was already voted in.
Issue: ADJUST-ARRAY-NOT-ADJUSTABLE
References: ADJUST-ARRAY (p297), ADJUSTABLE-ARRAY-P (p293),
MAKE-ARRAY (pp286-289), simple arrays (p28, 289),
simple strings with fill pointers (p299)
Category: CLARIFICATION and CHANGE
Edit history: 22-Apr-87, Version 1 by Pitman
15-Nov-88, Versions 2a,2b,2c by Pitman
02-Dec-88, Version 3 by Pitman
11-Jan-89, Version 4 by Pitman
16-Jan-89, Version 5, by Gabriel. Amended at the meeting to shorten.
23-Jan-89, Version 6, by Moon. Shorten without the bug introduced
by the amendment, add clarification of SIMPLE-ARRAY type.
15-Feb-89, Version 7, by Pitman. Minor changes per comments from
RPG and Dalton.
11-Mar-89, Version 8, by Pitman. Change category, add endorsements.
17-Mar-89, Version 9, by Moon, fix wording and examples to make it
clear that the semantics of simple-array is unchanged.
6-Jun-89, Version 10, by Moon and Gabriel, do over.
Problem Description:
There are a number of unclear passages in CLtL related to simple arrays
and adjustable arrays. There is disagreement on precisely how these
passages are to be interpreted, and no one is happy with the fact that
ADJUST-ARRAY works only on an implementation-dependent subset of arrays.
The description of the :ADJUSTABLE option to MAKE-ARRAY on p288 says that
``the argument, if specified and not NIL, indicates that it must be
possible to alter the array's size dynamically after it is created. This
argument defaults to NIL.'' The description of the :ADJUSTABLE option
does not say what MAKE-ARRAY will do if the argument is unsupplied or
explicitly NIL.
The description of ADJUSTABLE-ARRAY-P on p293 says that it is true ``if
the argument (which must be an array) is adjustable, and otherwise
false.'' However, the description of MAKE-ARRAY makes it clear that this
is not necessarily the same as asking if the array was created with
:ADJUSTABLE T. If ADJUSTABLE-ARRAY-P returns NIL, you know that
:ADJUSTABLE NIL was supplied (or no :ADJUSTABLE option was supplied), but
if ADJUSTABLE-ARRAY-P returns T, then there is no information about
whether :ADJUSTABLE was used.
The description of ADJUST-ARRAY on pp297-298 says that it is ``not
permitted to call ADJUST-ARRAY on an array that was not created with the
:ADJUSTABLE option.'' This is inconsistent with ADJUSTABLE-ARRAY-P.
The definition of SIMPLE-ARRAY on p.28 says ``an array that is not
displaced to another array, has no fill pointer, and is not to have its
size adjusted dynamically after creation is called a simple array.''
It is left unclear whether this is an implication or an equivalence,
i.e. whether there can be other simple arrays as well.
CLtL p.299 appears to refer to simple strings with fill pointers,
suggesting that it is an implication, but similar language is used for
equivalences in other parts of CLtL.
Proposal (ADJUST-ARRAY-NOT-ADJUSTABLE:IMPLICIT-COPY)
1. If MAKE-ARRAY is called with the :ADJUSTABLE, :FILL-POINTER,
and :DISPLACED-TO arguments each either unspecified or false, the
resulting array is a simple array. (This just repeats what CLtL
says on page 289; it is here to aid in understanding the next point.)
2. If MAKE-ARRAY is called with one or more of the :ADJUSTABLE,
:FILL-POINTER, or :DISPLACED-TO arguments true, whether the
resulting array is simple is unspecified.
3. It is permitted to call ADJUST-ARRAY on any array. (Remove the
restriction documented at the bottom of p.297.)
4. If ADJUST-ARRAY is applied to an array created with :ADJUSTABLE true,
the array returned is EQ to its first argument. It is not specified
whether ADJUST-ARRAY returns an array EQ to its first argument for any
other arrays. If the array returned by ADJUST-ARRAY is not EQ to its
first argument, the consequences of any reference to the original array
are undefined.
5. The predicate ADJUSTABLE-ARRAY-P is true if and only if ADJUST-ARRAY
will return a value EQ to this array when given this array as its first
argument.
Clarifications and Logical Consequences:
a. There is no specified way to create an array for which ADJUSTABLE-ARRAY-P
definitely returns NIL.
b. There is no specified way to create an array that is non-simple.
c. The definition of SIMPLE-ARRAY on p.28 is taken to be an implication,
not an equivalence. This is either a clarification or a change depending
on one's prior reading of that definition.
d. The meaning of ADJUSTABLE-ARRAY-P is changed.
e. As with such functions as DELETE and NCONC, textbooks should
instruct programmers to be careful to receive the value returned by
ADJUST-ARRAY, as it might not be EQ to the first argument.
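The idiom in point (e) can be sketched as follows. This is only an
illustration, assuming an implementation that follows this proposal (so
that ADJUST-ARRAY is accepted on any array); the names GROW-VECTOR and
GROW-EXAMPLE are hypothetical and not part of the proposal.

```lisp
;; Hypothetical helper: return a vector of NEW-LENGTH holding the
;; elements of VECTOR, padding with 0. The result may or may not be
;; EQ to VECTOR, so callers must use the returned value.
(defun grow-vector (vector new-length)
  (adjust-array vector new-length :initial-element 0))

;; Usage: always rebind the variable to the value returned.
(defun grow-example ()
  (let ((v (make-array 3 :initial-contents '(1 2 3))))
    (setq v (grow-vector v 5))
    v))
```

As with DELETE, the original binding is simply overwritten, so the code
behaves the same whether the array was adjusted in place or copied.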
Rationale:
Points 3 and 4 eliminate the problem of ADJUST-ARRAY only working on a
subset of arrays, by changing it to work on all arrays. It remains
implementation-dependent whether the array is modified in place or
copied, i.e. whether the result is EQ to the argument; however, many other
functions in Common Lisp have similar implementation-dependent behavior.
Implementation-dependent storage allocation or reuse is considered
more benign than implementation-dependent applicability of an operation.
Point 3 recognizes that ADJUST-ARRAY offers features that are offered by
no other function and which are useful in cases involving non-adjustable
arrays (for what amounts to copying). This change would allow an
expression such as:
  (SETQ X (ADJUST-ARRAY X ...))
to work reliably. Those desiring the old behavior could do:
  (IF (OR (NOT (ADJUSTABLE-ARRAY-P X))
          (NOT (EQUAL (ARRAY-RANK X) (LENGTH NEW-DIMENSIONS))))
      (ERROR "Array cannot be adjusted."))
to get the old style error checking.
Point 5 recycles the name ADJUSTABLE-ARRAY-P as a test for whether an
array is adjusted in place or by copying.
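A small sketch of how points 4 and 5 fit together, restricted to the one
case the proposal guarantees (an array created with :ADJUSTABLE true).
The function name ADJUSTABLE-ROUND-TRIP is hypothetical.

```lisp
;; For an array created with :ADJUSTABLE T, point 4 says ADJUST-ARRAY
;; returns an array EQ to its argument, and point 5 then implies that
;; ADJUSTABLE-ARRAY-P is true of it.
(defun adjustable-round-trip ()
  (let* ((a (make-array 4 :adjustable t :initial-element 0))
         (b (adjust-array a 8 :initial-element 0)))
    (values (adjustable-array-p a)   ; predicate
            (eq a b)                 ; adjusted in place?
            (length b))))            ; new size
```

For arrays created without :ADJUSTABLE, both the predicate and the EQ
test are left to the implementation, which is why the sketch does not
cover that case.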
Point 2 preserves the raison d'etre of simple arrays, which is to provide
a portable interface to implementation-dependent specialized arrays that
trade decreased functionality for faster access. A proposed alternative
was to specify a way to create an array that is guaranteed not to be
simple. This would have made (typep (make-array ...) 'simple-array)
return the same value in all implementations, but would have required
large changes to some implementations and would be of little benefit to
users. Users need to know that certain arrays are simple, so they can
put in declarations and get higher performance, but users have no need to
be able to create arrays that are definitely non-simple (for lower
performance) or definitely non-adjustable.
Examples:
1. The following program is conforming.
  (defun double (a)
    (adjust-array a (* (length a) 2)))
  (double (make-array 30))
2. The following program is conforming. In no implementation is the
type declaration violated.
  (let ((a (make-array 100)))
    (declare (simple-array a))
    (frob a))
3. The following program is non-conforming. The consequences of this
program are undefined because the type declaration is violated in some
implementations.
  (let ((a (make-array 100 :adjustable t)))
    (declare (simple-array a))
    (frob a))
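To make the contrast between examples 2 and 3 concrete, here is a
minimal sketch of a function that relies on a simple-array declaration.
The names SUM-ELEMENTS and SUM-EXAMPLE are hypothetical, and the sketch
assumes a one-dimensional array of element type T (a SIMPLE-VECTOR).

```lisp
;; Declaring V as a SIMPLE-VECTOR lets a compiler use the fast accessor
;; SVREF. The declaration is safe only for arrays known to be simple.
(defun sum-elements (v)
  (declare (type simple-vector v))
  (let ((total 0))
    (dotimes (i (length v) total)
      (incf total (svref v i)))))

;; Conforming use: no :ADJUSTABLE, :FILL-POINTER, or :DISPLACED-TO
;; argument is given, so point 1 guarantees the array is simple.
(defun sum-example ()
  (sum-elements (make-array 4 :initial-element 2)))
```

Passing an array made with :ADJUSTABLE T to SUM-ELEMENTS would be
non-conforming for the same reason as example 3: whether such an array
is simple is unspecified.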
Current Practice:
Every correct CLtL implementation conforms to points 1 and 2. It is
unlikely that any implementation currently exists that conforms to points
3, 4, and 5. Points 3 and 4 involve additions to an implementation to
support the copying form of ADJUST-ARRAY. Point 5 may involve a change
to ADJUSTABLE-ARRAY-P or may be able to use the existing implementation
of the function.
Symbolics Genera makes :ADJUSTABLE NIL arrays adjustable in most cases,
and ignores adjustability in deciding whether an array is a SIMPLE-ARRAY.
The arrays that are internally simple in Symbolics Genera are a different
subset of arrays from the type SIMPLE-ARRAY, because simplicity in that
implementation depends on the rank and total-size as well as on the
fill-pointer and displacement; thus Genera does not use the type
SIMPLE-ARRAY for anything.
Lucid, IIM, Ibuki, and Symbolics Cloe make :ADJUSTABLE NIL arrays
non-adjustable in all cases, and make every array non-simple that CLtL
does not require to be simple.
Macintosh Allegro Common Lisp v1.2 makes :ADJUSTABLE NIL arrays
non-adjustable in all cases, makes all arrays of rank other than 1
non-simple (violating point 1), and makes every array non-simple that
CLtL does not require to be simple.
Cost to Implementors:
The change to ADJUSTABLE-ARRAY-P is easy. The change to ADJUST-ARRAY may
involve some complex coding but should not be a large task. No changes
are required to anything connected with SIMPLE-ARRAY.
Cost to Users:
None in code that does not call ADJUSTABLE-ARRAY-P. This is a fully
upward-compatible change from the user's standpoint.
Benefits:
Programs that use simple arrays and/or adjust arrays will be easier
to port, as the language specification for these features will be
clearer. More programs will be able to call ADJUST-ARRAY, as its use
will not be restricted to a subset of arrays.
Non-Benefits:
Users who expect adjusting arrays created with :ADJUSTABLE NIL to signal
an error would not get the desired signal. A few programs might have
porting problems due to variation among implementations of whether the
result of ADJUST-ARRAY is EQ to the first argument.
Aesthetics:
Most people believe the status quo is unaesthetic. Having an aspect of
the language more clearly specified is an aesthetic improvement.
Allowing ADJUST-ARRAY on all arrays is an aesthetic improvement.
Discussion:
There are at least 110 messages of discussion preceding this version of the
proposal. It does not seem feasible to summarize them here.
Dick Gabriel, Dave Moon, and Guy Steele support this proposal.
∂11-Jun-89 1226 CL-Cleanup-mailer issue DYNAMIC-EXTENT-FUNCTION, version 2
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 11 Jun 89 12:26:05 PDT
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA04581; Sun, 11 Jun 89 13:26:28 -0600
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA20929; Sun, 11 Jun 89 13:26:25 -0600
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8906111926.AA20929@defun.utah.edu>
Date: Sun, 11 Jun 89 13:26:24 MDT
Subject: issue DYNAMIC-EXTENT-FUNCTION, version 2
To: cl-cleanup@sail.stanford.edu
I apologize for taking so long to finish this up -- I keep getting
distracted with "real work" lately....
Forum: CLEANUP
Issue: DYNAMIC-EXTENT-FUNCTION
References: Scope and Extent
Issue DYNAMIC-EXTENT
Category: ADDITION
Edit history: 04-Apr-89, Version 1 by Loosemore
11-Jun-89, Version 2 by Loosemore
Problem Description:
Proposal DYNAMIC-EXTENT:NEW-DECLARATION, passed at the March 89
meeting, provides a mechanism for declaring that the values of
variables have only dynamic (rather than indefinite) extent. It
would be useful to have similar functionality to indicate that
functional bindings may have only dynamic extent. (For example,
this would permit compilers to stack-allocate closures.)
Proposal (DYNAMIC-EXTENT-FUNCTION:EXTEND):
Extend the DYNAMIC-EXTENT declaration to accept arguments that are
lists of the form (FUNCTION <name>) where <name> is a function name,
as well as symbols.
A (FUNCTION <name>) list appearing in a DYNAMIC-EXTENT declaration is
used to declare that the lexically visible functional binding of <name>
has dynamic extent. Except for the interpretation of <name> as the
name of a function instead of the name of a variable, such a declaration
otherwise has semantics that are identical to those already described
in proposal DYNAMIC-EXTENT:NEW-DECLARATION.
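A minimal sketch of the proposed form, assuming an implementation that
accepts (FUNCTION <name>) in a DYNAMIC-EXTENT declaration. The function
name SUM-SCALED is hypothetical.

```lisp
;; SCALE closes over FACTOR. It is only passed "downward" to MAPCAR and
;; is never returned or stored, so declaring its functional binding to
;; have dynamic extent is safe here. An implementation may use this to
;; stack-allocate the closure, or may ignore the declaration.
(defun sum-scaled (numbers factor)
  (flet ((scale (x) (* x factor)))
    (declare (dynamic-extent (function scale)))
    (reduce #'+ (mapcar #'scale numbers))))
```

If SCALE were instead returned from SUM-SCALED, the declaration would be
a misdeclaration and the consequences would be undefined.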
Rationale:
This permits a programmer to offer advice to an implementation about
what functions may be stack-allocated for efficiency.
It may be difficult or impossible for a compiler to infer this
same information statically.
Current Practice:
JonL says that Lucid's compiler can stack-allocate closures, but they
have no mechanism for programmers to give the compiler permission to
do so.
HPCL-I has an UPWARD-CLOSURES declaration that pervasively affects
all closures created within the scope of the declaration.
The Symbolics Genera compiler can often infer when functions can be
implemented to have dynamic extent. Also, if a function has a
SYS:DOWNWARD-FUNCTION declaration in front of its body, then the
function is implemented with dynamic extent regardless of whether
the compiler thinks all uses are "downward". (This declaration is
rather peculiar because its scope is actually larger than the lambda
expression containing the declaration; implementationally, it's the
surrounding function definition.)
Cost to Implementors:
No cost is forced since implementations are permitted to simply
ignore the DYNAMIC-EXTENT declaration.
Cost to Users:
None. This change is upward compatible.
There may be some hidden costs to debugging using this declaration (or any
feature which permits the user to access dynamic extent objects without
the compiler proving that they are appropriate). If the user misdeclares
something and returns a pointer into the stack (or stores it in the heap),
an undefined situation may result and the integrity of the Lisp storage
mechanism may be compromised. Debugging these situations may be tricky,
but users who have asked for this feature have indicated a willingness
to deal with such costs. Nevertheless, the perils should be clearly
documented and casual users should not be encouraged to use this
declaration.
Cost of Non-Adoption:
Some portable code would be forced to run more slowly (due to
GC overhead), or to use non-portable language features.
Benefits:
The cost of non-adoption is avoided.
Aesthetics:
This declaration allows a fairly low level optimization to work
by asking the user to provide only very high level information.
The alternatives (sharpsign conditionals, some of which may
lead to more bit-picky abstractions) are far less aesthetic.
Discussion:
Loosemore supports DYNAMIC-EXTENT-FUNCTION:EXTEND.
This proposal does not attempt to address the issue of specifying
dynamic extent for anonymous closures (which is really a special case
of the more general problem of specifying dynamic extent for unnamed
objects of any type). It's possible, although often awkward, to
restructure the program to give the object a name and explicitly
identify its extent.
One possible solution to the problem of dynamic extent for anonymous
lambdas would be to clarify that a reference to a closed-over variable
or function appearing lexically within a FUNCTION form is enough to
cause its value to be "saved" when the FUNCTION form is executed,
regardless of whether or not that reference is actually executed when
the resulting function is called. Then, if all of the closed-over
functions and variables referenced within a closure are declared to
have dynamic extent, the closure could be assumed to have dynamic
extent as well. (More precisely, its maximum extent would be the
intersection of the extents of the closed-over functions and
variables.)
-------
∂13-Nov-88 1534 CL-Cleanup-mailer Issue: FORMAT-ROUNDING (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 13 Nov 88 15:34:07 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 491649; Sun 13-Nov-88 18:34:11 EST
Date: Sun, 13 Nov 88 18:33 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: FORMAT-ROUNDING (Version 1)
To: CL-Cleanup@SAIL.Stanford.EDU
Message-ID: <881113183353.8.KMP@BOBOLINK.SCRC.Symbolics.COM>
Kathy distributed this in hardcopy at the last X3J13 meeting.
I'm agnostic on it for now, but wanted to get it into the
archives. -kmp
-----
Issue: FORMAT-ROUNDING
References: FORMAT (p. 390)
Category: CHANGE
Edit history: 5-OCT-88, Version 1 by Chapman
Problem Description:
For the ~F FORMAT directive, the implementation can either round up
or down when the rounding produces printed values equidistant from
the scaled value of the argument.
Proposal (FORMAT-ROUNDING:SPECIFY)
Specify that the implementation rounds up when the rounding produces
printed values equidistant from
the scaled value of the argument.
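A small sketch of the kind of case at issue. The value 0.25 is exactly
representable in binary floating point and lies halfway between the two
one-digit candidates 0.2 and 0.3. The function name SHOW-TIE is
hypothetical, and no output is claimed because the result currently
depends on the implementation.

```lisp
;; Print X with one digit after the decimal point using ~F.
;; For X = 0.25 the two candidate outputs are equidistant from X, so an
;; implementation may currently produce either one; under the proposal
;; it would be required to round up.
(defun show-tie (x)
  (format nil "~,1F" x))
```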
Rationale:
This change allows predictable results to occur when
rounding occurs.
Current Practice:
Adoption Cost:
Minimal.
Benefits:
This clarification will assist users in writing portable code.
Conversion Cost:
Minimal.
Aesthetics:
None.
Discussion:
∂08-Oct-88 1751 X3J13-mailer DRAFT Issue: HASH-TABLE-PRINTED-PREPRESENTATION (Version 2)
Received: from Xerox.COM by SAIL.Stanford.EDU with TCP; 8 Oct 88 17:51:32 PDT
Received: from Semillon.ms by ArpaGateway.ms ; 08 OCT 88 17:44:14 PDT
Date: 8 Oct 88 17:43 PDT
Sender: masinter.pa@Xerox.COM
Subject: DRAFT Issue: HASH-TABLE-PRINTED-PREPRESENTATION (Version 2)
From: cl-cleanup@sail.stanford.edu
To: x3j13@sail.stanford.edu
REPLY-TO: cl-cleanup@sail.stanford.edu
line-fold: NO
cc: Masinter.pa@Xerox.COM
Message-ID: <881008-174414-2396@Xerox>
The issues listed under Additional Comments still have not been resolved.
Status: DRAFT (see Additional Comments)
Issue: HASH-TABLE-PRINTED-PREPRESENTATION
Category: ENHANCEMENT
Edit history: 23-May-88, Version 1 by Touretzky
8-Jun-88, Version 2 by Masinter (as per cl-cleanup discussion)
Description:
Hash tables are currently second-class data structures when compared to
lists, vectors, and structures, because they have no READable printed
representation. This proposal introduces a #H reader syntax for hash
tables and a switch to control when hash tables will be printed this way.
Proposal (HASH-TABLES-PRINTED-REPRESENTATION:#H-NOTATION) :
1) Introduce the following reader notation for hash tables:
#nH(type (k1 v1) (k2 v2) ...)
"n" is the size of the table; it is analogous to the :size argument to
MAKE-HASH-TABLE. If omitted, the system picks some reasonable size.
"type" is one of EQ, EQL, or EQUAL. If omitted it defaults to EQL.
The (ki vi) pairs consist of a key and a value. There may be any number of
such pairs, including zero. Order is not significant. It is an error for
two keys to be identical (using the EQ, EQL, or EQUAL test, as
appropriate.)
2) Introduce a switch called *PRINT-HASH* whose initial value is
implementation-dependent. If *PRINT-HASH* is T, hash tables are printed
using the #H syntax (with all optional components supplied), subject to the
usual limits imposed by *PRINT-LEVEL* and *PRINT-LENGTH*. If *PRINT-HASH*
is NIL, hash tables are printed using the current #<HASH-TABLE ...> syntax.
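The reader side of point 1 could be sketched roughly as below. This is an
illustration only, not the proposal's implementation: the names
READ-HASH-TABLE-LITERAL, *HASH-READTABLE*, and READ-HASH-FROM-STRING are
hypothetical, the sketch installs #H in a private copy of the standard
readtable, it assumes the test symbol is read in a package that uses the
standard Lisp package, and it does not check for duplicate keys.

```lisp
;; Dispatch function for #nH(type (k1 v1) (k2 v2) ...).
;; SIZE is the optional numeric argument n, or NIL if omitted.
(defun read-hash-table-literal (stream subchar size)
  (declare (ignore subchar))
  (let* ((body (read stream t nil t))
         ;; A leading symbol names the test; otherwise default to EQL.
         (test (if (and body (symbolp (first body)))
                   (pop body)
                   'eql))
         (table (if size
                    (make-hash-table :test test :size size)
                    (make-hash-table :test test))))
    ;; Keys and values are taken literally; they are not evaluated.
    (dolist (pair body table)
      (setf (gethash (first pair) table) (second pair)))))

;; A private readtable, so the global reader syntax is left untouched.
(defvar *hash-readtable*
  (let ((readtable (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\H #'read-hash-table-literal
                                  readtable)
    readtable))

;; Read one object from STRING with #H syntax enabled.
(defun read-hash-from-string (string)
  (let ((*readtable* *hash-readtable*))
    (values (read-from-string string))))
```

For example, (read-hash-from-string "#20H(EQL (1 37) (2 42))") would
return a hash table with two entries. The printer side (*PRINT-HASH*)
is not sketched here.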
Rationale:
This is a useful upward compatible extension (except in
implementations that have usurped #H for other purposes), with very
low adoption cost.
Cost to Implementors:
A simple change to PRIN1 and the pretty printer. Most of the code
will be similar to existing routines for printing vectors in #()
notation and arrays in #nA() notation. The reader would change to
read this notation.
Cost to Users:
Small. Programs that want to control all *PRINT- parameters will need
to know about yet another parameter.
Benefits:
This proposal makes hash tables first class objects. If
*PRINT-HASH* is T, their contents become visible in all the normal ways,
e.g., if FOO is bound to a hash table object, then typing FOO to a
read-eval-print loop will display the contents of the hash table. Hash
table contents may also be displayed by TRACE if the table is passed as an
argument; they may also be displayed by the debugger. Finally, hash tables
may appear as literal objects in programs and be read or written to files.
Current practice:
We know of no current implementations of this proposal.
Although some implementations allow the user to see hash table contents
with DESCRIBE or INSPECT, not all do. CMU Common Lisp's DESCRIBE, for
example, does not show hash table contents. This reinforces the need for
a standard #H notation to guarantee that users can manipulate a hash table
as easily as a vector, array, or structure.
Discussion:
Several alternatives have been suggested for the syntax of #H.
- preferred notation: #H(EQL (FOO 37) (BAR 42))
- dotted pair notation: #H(EQL (FOO . 37) (BAR . 42))
- property list: #H(EQL FOO 37 BAR 42)
- pseudo-structure: #S(HASH-TABLE TYPE EQL SIZE 20 INITIAL-CONTENTS ((FOO 37) (BAR 42)))
One problem with the currently proposed #H notation is that it provides no
way to specify a rehash-size or rehash-threshold. This should not be a
fatal flaw, though. The #() notation is also incomplete: it cannot
indicate whether the vector has a fill pointer, nor can it show when the
element-type is something more specific than T. The latter problem is also
shared by #nA notation. Some object that a flaw in #A is no
reason to introduce the same flaw elsewhere.
This prompted yet another proposal:
#[size]H([type] [rehash-size] [rehash-threshold] (ki vi)*)
e.g. #65H(EQL 101 65 (FOO 37) (BAR 42))
along with alternative settings for *PRINT-HASH*, NIL, T, :BRIEF, where the
latter would leave out all of the options.
- - - - Additional Comments - - - - -
you still can't
call the objects "first class" if the printed representation cannot be
read in as an equivalent copy; and the fact that CL has some other datatypes
that aren't "first class" doesn't argue for doing something substandard
for hash-tables.
One problem with the currently proposed #H notation is that it provides
no way to specify a rehash-size or rehash-threshold. This should not be
a fatal flaw, though. The #() notation is also incomplete: it cannot
indicate whether the vector has a fill pointer, nor can it show when the
element-type is something more specific than T. The latter problem is
also shared by #nA notation.
I think this is a fatal flaw. The fact that *some* complex classes of
arrays also share this fatal flaw is no argument for retaining it. It
is still the case that simple arrays of the more common element types
do not have the flaw; and several years ago there was some discussion
on how to fix other manifestations of the flaw on multi-dimensional arrays.
∂20-Mar-89 1241 CL-Cleanup-mailer Issue: HASH-TABLE-SIZE (version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 20 Mar 89 12:41:39 PST
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 560978; Mon 20-Mar-89 11:55:18 EST
Date: Mon, 20 Mar 89 11:55 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: HASH-TABLE-SIZE (version 1)
To: CL-Cleanup@sail.stanford.edu
cc: chapman%aitg.DEC@decwrl.dec.com
Message-ID: <19890320165503.1.MOON@EUPHRATES.SCRC.Symbolics.COM>
This issue came up while reviewing section 2.2 of the draft standard.
Does anyone object if I mail this to X3J13 and bring it up at the
March meeting? I couldn't find any sign that it has already been addressed.
Issue: HASH-TABLE-SIZE
References: CLtL p.283
Category: CLARIFICATION
Edit history: Version 1, 20-Mar-89, by Moon
Problem description:
CLtL contradicts itself on the meaning of the :SIZE argument to
MAKE-HASH-TABLE. At the top of p.283, it says that the size is "the
maximum number of entries it can hold. Usually the actual capacity of
the table is somewhat less." At the bottom of the page it says "this
argument serves as a hint to the implementation of approximately how
many entries you intend to store." So is :SIZE intended to be the
actual capacity of the table, or the amount of storage allocated to hold
the table? For example, if the implementation of hash tables is
designed for a loading of 65%, and the user specifies :SIZE 100, does
the table returned have space allocated for 100 entries, so that it
overflows and becomes bigger when 65 entries are inserted, or does the
table have space allocated for 154 entries, so that it overflows and
becomes bigger when 100 entries are inserted?
Proposal (HASH-TABLE-SIZE:INTENDED-ENTRIES):
Believe the bottom of p.283 rather than the top. The :SIZE argument
is approximately the number of entries that can be inserted without
the table having to grow.
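The arithmetic behind the two readings can be sketched as follows, using
the 65% loading from the problem description. The function names are
hypothetical, and the load factor is written as the exact rational 13/20
to avoid floating-point rounding.

```lisp
;; Reading adopted by the proposal: :SIZE is the number of entries the
;; user intends to store, so the implementation allocates enough slots
;; that SIZE entries fit at the given load factor.
(defun slots-for-entries (size load-factor)
  (ceiling size load-factor))

;; The other reading: :SIZE is the number of slots allocated, so the
;; table must grow once SIZE * LOAD-FACTOR entries are present.
(defun entries-before-growth (size load-factor)
  (floor (* size load-factor)))
```

With :SIZE 100 and a loading of 13/20, the first function gives 154
slots and the second gives 65 entries, matching the figures above.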
Rationale:
The bottom of p.283 is user-oriented; the top is implementation-oriented.
User-oriented seems more appropriate.
Current practice:
Symbolics Genera 7.4 adheres to HASH-TABLE-SIZE:INTENDED-ENTRIES.
Other implementations were not surveyed.
Cost to Implementors:
At worst adding a multiplication to MAKE-HASH-TABLE.
Cost to Users:
Probably none, but it is hard to predict.
Cost of non-adoption:
Implementations will probably vary in which of the two interpretations
they believe. The language standard will not be self-consistent.
Performance impact:
None of any significance.
Benefits/Esthetics:
More self-consistent language.
Discussion:
None.
∂06-Feb-89 1354 CL-Cleanup-mailer Issue: IGNORE-VARIABLE (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 6 Feb 89 13:53:55 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 534454; Mon 6-Feb-89 16:51:58 EST
Date: Mon, 6 Feb 89 16:51 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: IGNORE-VARIABLE (Version 1)
To: CL-Cleanup@SAIL.Stanford.EDU
Message-ID: <890206165129.4.KMP@BOBOLINK.SCRC.Symbolics.COM>
This was already being discussed under DESTRUCTURING-BIND. I'm
spawning a new issue name to help partition/focus the discussion.
-----
Issue: IGNORE-VARIABLE
Forum: Cleanup
References: IGNORE declaration (p160)
Category: CHANGE
Edit history: 06-Feb-89, Version 1 by Pitman
Status: For Internal Discussion
Problem Description:
Many users of Symbolics Common Lisp (under Symbolics Genera) have grown
used to the variable named `IGNORE' receiving special treatment and have
complained that Common Lisp does not offer the same `feature.'
Proposal (IGNORE-VARIABLE:SPECIAL-TREATMENT):
1. Define that the variable IGNORE (in the LISP package only) is always
implicitly ignored. It is an error to use the variable IGNORE.
A declaration of (IGNORE IGNORE) is permitted, but redundant.
2. Permit the variable IGNORE (or any variable declared ignored with the
IGNORE declaration) to be a duplicated variable in a binding list.
Rationale:
1. Greater syntactic conciseness.
This is effectively current practice in some implementations.
2. If only one variable is going to be dignified in this way, it must
be possible to repeat it. If it makes sense to repeat IGNORE, it makes
sense to repeat any name declared IGNORE.
Test Case:
Form                                          CLtL      Proposed
#1: (DEFUN FOO (IGNORE) T)                    may warn  ok
#2: (DEFUN FOO (IGNORE) IGNORE)               ok        is error
#3: (DEFUN FOO (IGNORE IGNORE) T)             is error  ok
#4: (DEFUN FOO (IGNORE IGNORE) IGNORE)        is error  is error
#5: (DEFUN FOO (X X) (DECLARE (IGNORE X)) T)  is error  ok
#6: (DEFUN FOO (X X) (DECLARE (IGNORE X)) X)  is error  is error
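For comparison, this is a sketch of what portable code has to write
today, using only the existing IGNORE declaration from CLtL. The function
names are hypothetical; under the proposal the explicit declarations
could be dropped by naming the unused variables IGNORE.

```lisp
;; Each unused parameter needs a distinct name plus a declaration.
(defun second-of-three (first-value wanted last-value)
  (declare (ignore first-value last-value))
  wanted)

;; The same pattern arises with MULTIPLE-VALUE-BIND when only some of
;; the returned values are of interest.
(defun quotient-only (n d)
  (multiple-value-bind (q r) (floor n d)
    (declare (ignore r))
    q))
```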
Current Practice:
Symbolics Genera currently treats all variables with the name "IGNORE"
(regardless of package) as ignored variables and complains if you try
to use them. It provides no way to turn off the `feature.'
Symbolics Cloe does not special-case variables named "IGNORE".
Cost to Implementors:
Small.
Cost to Users:
Although the change is technically incompatible, very few programs would
be likely to require change, since a name like IGNORE is unlikely to have
been used for anything other than an ignored variable.
Cost of Non-Adoption:
Some programs would be clumsier to write.
Benefits:
Less verbose code in some cases.
Existing code in some dialects would not require translation.
Aesthetics:
At first brush, some might argue that this makes a bad special case in
the treatment of symbols as variables. However, on closer inspection,
it's clear that the language already treats some symbols magically:
- Symbols on the keyword package are treated specially with respect
to their values.
- Symbols like *PRINT-LEVEL*, etc. have pre-defined special meanings
and cannot be bound without knowing their conventions.
- Symbols like NIL and MOST-POSITIVE-FIXNUM have pre-defined values
and cannot be bound at all.
The thing which makes these changes palatable is that none of them is
based on the symbol's name alone. As such, anyone unhappy with this treatment
can simply make a new package that shadows the symbol and the symbol will
be treated normally if that is what is desired.
Discussion:
When translating Macsyma from Zetalisp to Common Lisp a few years ago,
the use of Maclisp/LispM-style IGNORE was so pervasive that Pitman
ultimately just did (PROCLAIM '(SPECIAL IGNORE)) to muffle all the
would-be warnings. Except for the situation of duplicate variable names,
this is a totally portable solution, but it is -not- efficient.
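The workaround just described can be sketched as below. This is only an
illustration under stated assumptions: the package IGNORE-SKETCH and the
function SECOND-ARGUMENT are hypothetical, DECLAIM is assumed to be
available as the compile-time counterpart of a top-level PROCLAIM, and
the symbol is shadowed because an implementation may not permit a user
program to proclaim LISP:IGNORE itself special.

```lisp
(DEFPACKAGE "IGNORE-SKETCH"
  (:USE "COMMON-LISP")
  (:SHADOW "IGNORE"))
(IN-PACKAGE "IGNORE-SKETCH")

;; IGNORE here is IGNORE-SKETCH::IGNORE, not the symbol in the LISP package.
(DECLAIM (SPECIAL IGNORE))

;; Binding a special variable counts as a use, so no "unused variable"
;; warning is expected, but each call pays for a dynamic binding.
;; A lambda list such as (IGNORE IGNORE) would still be an error.
(DEFUN SECOND-ARGUMENT (IGNORE Y)
  Y)

(IN-PACKAGE "COMMON-LISP-USER")
```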
Some people have suggested that (PROCLAIM '(IGNORE IGNORE)) should
work, but others argue that PROCLAIM should not, in general, be assumed
to affect lexical references.
Moon and Pitman mildly support option SPECIAL-TREATMENT.
Symbolics Common Lisp users have often expressed a desire to see this
feature in portable Common Lisp.
Some Symbolics users have complained specifically about the
incompatibility between Symbolics Genera and Symbolics Cloe in their
treatment of the variable IGNORE. Although Cloe has CLtL to fall back
on for justification, Cloe is invariably seen as the "bad guy" in their
reports because it stubbornly keeps them from getting the functionality
they want on the basis of what to them seems an irrelevant religious
or philosophical concern.
Some people might want to see a symmetric treatment of an ignored
variable in SETQ and MULTIPLE-VALUE-SETQ.
Maclisp used to permit NIL to denote an ignored variable. In general
this worked out well. The disadvantages to using NIL over using
IGNORE are:
- The common kind of macroexpansion error where NIL is substituted
for a useful value is harder to detect.
- There would be a potential conflict between any meaning for NIL
in its `proper role' at the end of a dotted list and any opposing
meaning that a destructuring operation (eg, DEFMACRO, LOOP,
DESTRUCTURING-BIND) might want to assign to a dotted variable in
that position.
Note well: Experience with Symbolics Genera shows that special case
treatment based only on the name -- or some substring thereof -- and
not also on the package is a bad idea because there is no way to get
away from the feature if you don't like it. It is therefore an important
aspect of this proposal that only LISP:IGNORE is affected, and that an
EQ-test (and not a substring comparison) is the criterion for
identifying this magic variable.
∂11-Apr-89 0714 CL-Cleanup-mailer Issue: LOAD-TRUENAME (Version 4)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 11 Apr 89 07:14:20 PDT
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 575061; Tue 11-Apr-89 10:12:53 EDT
Date: Tue, 11 Apr 89 10:12 EDT
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: LOAD-TRUENAME (Version 4)
To: CL-Cleanup@SAIL.Stanford.EDU
Message-ID: <890411101208.3.KMP@BOBOLINK.SCRC.Symbolics.COM>
New version to accommodate Moon's comments.
I changed the Example a bit.
I added a paragraph to the Discussion.
Everything else is the same.
I'm hoping this is a final version.
-kmp
-----
Issue: LOAD-TRUENAME
Forum: Cleanup
References: LOAD (p426), PROVIDE (p188), REQUIRE (p188),
Issue REQUIRE-PATHNAME-DEFAULTS
Category: ADDITION
Edit history: 13-Mar-89, Version 1 by Pitman
29-Mar-89, Version 2 by Moon (add -PATHNAME vars)
10-Apr-89, Version 3 by Pitman (clarify v2)
11-Apr-89, Version 4 by Pitman (merge Moon's v3 comments)
Problem Description:
It is difficult to construct sets of software modules which work
together as a unit and which port between different implementations.
REQUIRE and PROVIDE were intended to provide this level of support
but have `failed' to be portable in practice.
Typical user configurations involve a `system definition' file which
loads the modules of a `system' (collection of software modules).
Among the specific problems which arise are:
- File system types may vary. Different file syntax must be used for
each site.
- Even with the same Lisp implementation and host file system type,
the directory in which a software system resides may differ from
delivery site to delivery site.
- Multiple `copies' of the same system may reside in different
directories on the same machine.
Proposal (LOAD-TRUENAME:NEW-PATHNAME-VARIABLES):
Introduce new variables:
*LOAD-TRUENAME* [Variable]
This special variable is initially NIL, but is bound by LOAD to
hold the truename of the pathname of the file being loaded.
*COMPILE-FILE-TRUENAME* [Variable]
This special variable is initially NIL, but is bound by
COMPILE-FILE to hold the truename of the pathname of the file
being compiled.
*LOAD-PATHNAME* [Variable]
This special variable is initially NIL but is bound by LOAD
to hold a pathname which represents the filename given as the
first argument to LOAD merged against the defaults.
That is, (PATHNAME (MERGE-PATHNAMES arg1)).
*COMPILE-FILE-PATHNAME* [Variable]
This special variable is initially NIL but is bound by COMPILE-FILE
to hold a pathname which represents the filename given as the
first argument to COMPILE-FILE merged against the defaults.
That is, (PATHNAME (MERGE-PATHNAMES arg1)).
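The binding behavior described above can be approximated in user code.
The following is a minimal sketch, not a description of how any
implementation's LOAD works: the variables *SKETCH-LOAD-PATHNAME* and
*SKETCH-LOAD-TRUENAME* and the function SKETCH-LOAD are hypothetical
stand-ins for the proposed variables and for LOAD itself.

```lisp
(DEFVAR *SKETCH-LOAD-PATHNAME* NIL)   ; stands in for *LOAD-PATHNAME*
(DEFVAR *SKETCH-LOAD-TRUENAME* NIL)   ; stands in for *LOAD-TRUENAME*

(DEFUN SKETCH-LOAD (FILENAME)
  ;; Both variables are NIL outside the call and are bound for its
  ;; dynamic extent.  A real LOAD may also try default file types;
  ;; this sketch requires that the merged pathname name an existing file.
  (LET* ((*SKETCH-LOAD-PATHNAME* (PATHNAME (MERGE-PATHNAMES FILENAME)))
         (*SKETCH-LOAD-TRUENAME* (TRUENAME *SKETCH-LOAD-PATHNAME*)))
    (LOAD *SKETCH-LOAD-TRUENAME*)))
```

Forms in the loaded file see the bound values; once SKETCH-LOAD returns,
both variables are NIL again.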
Example:
------ File SETUP ------
(IN-PACKAGE "MY-STUFF")
(DEFMACRO COMPILE-TRUENAME () `',*COMPILE-FILE-TRUENAME*)
(DEFVAR *MY-COMPILE-TRUENAME* (COMPILE-TRUENAME) "Just for debugging.")
(DEFVAR *MY-LOAD-PATHNAME* *LOAD-PATHNAME*)
(DEFUN LOAD-MY-SYSTEM ()
(DOLIST (MODULE-NAME '("FOO" "BAR" "BAZ"))
(LOAD (MERGE-PATHNAMES MODULE-NAME *MY-LOAD-PATHNAME*))))
------------------------
(LOAD "SETUP")
(LOAD-MY-SYSTEM)
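The example relies on MERGE-PATHNAMES filling in the components that the
module name leaves unspecified from the saved load pathname. A small
sketch of that step follows. The names MODULE-PATHNAME-SKETCH and
*EXAMPLE-SETUP-PATHNAME* are hypothetical, and the list form of the
directory component is an assumption about the implementation's pathname
representation.

```lisp
;; A stand-in for the value *LOAD-PATHNAME* might have while SETUP loads.
(DEFVAR *EXAMPLE-SETUP-PATHNAME*
  (MAKE-PATHNAME :DIRECTORY '(:ABSOLUTE "SYSTEMS" "MY-STUFF")
                 :NAME "SETUP"
                 :TYPE "LISP"))

(DEFUN MODULE-PATHNAME-SKETCH (MODULE-NAME SETUP-PATHNAME)
  ;; The module supplies only a name; the directory and type are
  ;; taken from the pathname of the setup file.
  (MERGE-PATHNAMES (MAKE-PATHNAME :NAME MODULE-NAME) SETUP-PATHNAME))
```

Calling (MODULE-PATHNAME-SKETCH "FOO" *EXAMPLE-SETUP-PATHNAME*) yields a
pathname whose name is "FOO" and whose directory and type come from the
setup pathname, which is why the modules are found `nearby.'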
Rationale:
This satisfies the most common instances of the frequently reported
problem in the Problem Description.
The ...-TRUENAME* variables are useful to tell the real file being
loaded.
The ...-PATHNAME* variables are useful to find information about
the original link names or logical device names mentioned in the
pathname to be opened but no longer reflected in the truename.
Note that it is not adequate to just have the -PATHNAME* variables
since TRUENAME on these pathnames might not yield the value of the
-TRUENAME* variables if the file has been deleted or protected
since the open occurred (in some implementations).
Current Practice:
Wide variation.
In some implementations, calling LOAD binds or sets
*DEFAULT-PATHNAME-DEFAULTS* so that pathnames named in a file being
LOADed will default to being `nearby.'
Some implementations provide special variables that are similar or
identical to one or both of those proposed.
Some implementations have a way to represent the pathname for the
current working directory, and make the default pathname default
to that, so that loading without specifying a default again tends to
get `nearby' files.
None of these techniques is portable, unfortunately, because there
is no agreement.
Cost to Implementors:
Very small.
Cost to Users:
None. This change is upward compatible.
Cost of Non-Adoption:
Continued difficulty for anyone trying to put a system of modules
in a form where they can be conveniently delivered using portable code.
Benefits:
The cost of non-adoption is avoided.
Aesthetics:
Negligible.
Discussion:
Touretzky raised the issue most recently on Common-Lisp. A number
of people immediately jumped on the bandwagon, indicating it was
important to them, too.
Pitman made three suggestions in response, of which the above is
the first. The others included:
2. Variables *LOAD-TRUENAMES* and *COMPILE-FILE-TRUENAMES* which hold
lists of the truenames of all files being loaded or compiled,
respectively, during the dynamic invocation of LOAD and COMPILE-FILE.
3. Variable *LOAD-OR-COMPILE-FILE-TRUENAMES* which holds a list like
((LOAD truename) (COMPILE-FILE truename) ...)
during the dynamic invocation of LOAD and COMPILE-FILE.
Touretzky responded:
``I like KMP's proposals. I like the second one best: have separate
variables for files being loaded and files being compiled, and use
them to maintain a stack so we can see the nesting of loads within
files.''
Pitman ultimately chose to present the first rather than the second
because it seemed simpler, easier to explain, and more likely to
pass at this late date.
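For readers comparing the alternatives, the stack-like behavior of
suggestion 2 might look roughly like the sketch below. The variable
*SKETCH-LOAD-TRUENAMES* and the function SKETCH-NESTED-LOAD are
hypothetical; nothing here is part of the proposal being presented.

```lisp
(DEFVAR *SKETCH-LOAD-TRUENAMES* '())

(DEFUN SKETCH-NESTED-LOAD (FILENAME)
  ;; Each nested call pushes one truename; the innermost file is first.
  ;; The list shrinks again as each call returns.
  (LET ((*SKETCH-LOAD-TRUENAMES*
          (CONS (TRUENAME FILENAME) *SKETCH-LOAD-TRUENAMES*)))
    (LOAD FILENAME)))
```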
Other suggestions which were considered and discarded were:
a. Provide just variables *LOAD-STREAM* and *COMPILE-FILE-STREAM*.
Then PATHNAME and TRUENAME could be used to yield the
information contained in the -PATHNAME* and -TRUENAME* variables
of the proposal above.
b. Like (a), but call both variables *STANDARD-INPUT*. That is,
say that LOAD and COMPILE-FILE bind *STANDARD-INPUT* to the
stream being loaded.
There were a number of pitfalls with this approach which all center
around the way it invites the user to do other operations besides
PATHNAME and TRUENAME. Not only would some people be confused by
the difference between the characteristics of *LOAD-STREAM* for
compiled and interpreted files, but also even with interpreted
streams, the actual position of the stream pointer at the time of
execution of the forms contained in the file could vary between
implementations in a way that became a lurking portability barrier.
Since the observed user need which spawned this discussion was for
a way to tell what file was being loaded and not for a way to
manipulate the stream, it seemed best to just go with the variables
that addressed that specific need--fewer pitfalls and more perspicuous
code are likely to result.
∂15-Jun-89 0921 X3J13-mailer issue MACRO-CACHING, version 3
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 15 Jun 89 09:21:25 PDT
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA15952; Thu, 15 Jun 89 10:21:37 -0600
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA23878; Thu, 15 Jun 89 10:21:35 -0600
Date: Thu, 15 Jun 89 10:21:35 -0600
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8906151621.AA23878@defun.utah.edu>
To: x3j13@sail.stanford.edu
Subject: issue MACRO-CACHING, version 3
Reply-To: cl-compiler@sail.stanford.edu
This issue was distributed prior to the March meeting but was tabled so
that we could produce a simplified writeup. Here it is.
Issue: MACRO-CACHING
Forum: Compiler
References: 8.2 Macro Expansion (CLtL pp151-152),
Issues PACKAGE-CLUTTER, LISP-SYMBOL-REDEFINITION,
CONSTANT-MODIFICATION,
and MACRO-ENVIRONMENT-EXTENT
Category: Clarification
Edit history: 31-Jan-89, Version 1 by Pitman
11-Mar-89, Version 2 by Loosemore (add discussion)
30-May-89, Version 3 by Loosemore (simplify, rewrite)
Status: Ready for release
Problem Description:
The description of *MACROEXPAND-HOOK* in CLtL states that its purpose
is "to facilitate various techniques for improving interpretation
speed by caching macro expansions". However, there is no portable way
to correctly perform such caching.
Caching by displacement won't work because the same (EQ) macro call
form may appear in distinct lexical contexts. In addition, the macro
call form may be a read-only constant.
Caching by table lookup won't work because such a table would have to
be keyed by both the macro call form and the environment, and proposal
MACRO-ENVIRONMENT-EXTENT:DYNAMIC (passed at the March 1989 meeting)
states that macro environments are permitted to have only dynamic
extent.
Caching by storing macro call forms and expansions within the
environment object itself would work, but there are no portable
primitives that would allow users to do this.
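The table-lookup problem can be seen in a deliberately naive sketch. The
names *NAIVE-EXPANSION-CACHE*, NAIVE-CACHING-HOOK, and
DEMONSTRATE-STALE-EXPANSION are hypothetical. The two anonymous expander
functions stand in for two MACROLET definitions of the same macro name
in different lexical contexts.

```lisp
(DEFVAR *NAIVE-EXPANSION-CACHE* (MAKE-HASH-TABLE :TEST #'EQ))

;; Has the argument convention of a *MACROEXPAND-HOOK* function, but
;; keys the cache on the form alone and disregards the environment.
(DEFUN NAIVE-CACHING-HOOK (EXPANDER FORM ENV)
  (MULTIPLE-VALUE-BIND (EXPANSION FOUND-P)
      (GETHASH FORM *NAIVE-EXPANSION-CACHE*)
    (IF FOUND-P
        EXPANSION
        (SETF (GETHASH FORM *NAIVE-EXPANSION-CACHE*)
              (FUNCALL EXPANDER FORM ENV)))))

(DEFUN DEMONSTRATE-STALE-EXPANSION ()
  (LET ((FORM (LIST 'FOO)))             ; one EQ form, used twice
    (LIST (NAIVE-CACHING-HOOK
            #'(LAMBDA (FORM ENV) (DECLARE (IGNORE FORM ENV)) '(+ 1 1))
            FORM NIL)
          (NAIVE-CACHING-HOOK
            #'(LAMBDA (FORM ENV) (DECLARE (IGNORE FORM ENV)) '(+ 1 2))
            FORM NIL))))
;; (DEMONSTRATE-STALE-EXPANSION) returns ((+ 1 1) (+ 1 1)):
;; the second expander is never called.
```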
Proposal (MACRO-CACHING:DISALLOW):
(1) Remove the suggestion that *MACROEXPAND-HOOK* be used for caching
macroexpansions. Instead, suggest that it might be used for
debugging purposes.
(2) Clarify that although there is no correct portable way to use
*MACROEXPAND-HOOK* to cache macro expansions, there is no
requirement that an implementation call the macro expansion
function more than once for a given form and lexical environment.
Rationale:
Item (1) fixes the description of what *MACROEXPAND-HOOK* is for, from
the point of view of a user. Item (2) allows implementors to use
other, correct but nonportable techniques for caching macro
expansions.
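As an illustration of the debugging use suggested in item (1), a hook
might simply record each macro call before expanding it. This is a
sketch only; *EXPANSION-LOG*, LOGGING-MACROEXPAND-HOOK, TWICE, and
EXPAND-WITH-LOGGING are hypothetical names.

```lisp
(DEFVAR *EXPANSION-LOG* '())

;; Records the form, then expands it exactly as the default hook would.
(DEFUN LOGGING-MACROEXPAND-HOOK (EXPANDER FORM ENV)
  (PUSH FORM *EXPANSION-LOG*)
  (FUNCALL EXPANDER FORM ENV))

(DEFMACRO TWICE (X) `(* 2 ,X))

(DEFUN EXPAND-WITH-LOGGING (FORM)
  (LET ((*MACROEXPAND-HOOK* #'LOGGING-MACROEXPAND-HOOK))
    (MACROEXPAND-1 FORM)))
```

After (EXPAND-WITH-LOGGING '(TWICE 21)), the most recent entry in
*EXPANSION-LOG* is the form (TWICE 21). No caching is involved, so the
hook does not affect the semantics of expansion.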
Proposal (MACRO-CACHING:DEPRECATE):
This is the same as DISALLOW, but also deprecate *MACROEXPAND-HOOK*.
Rationale:
Since *MACROEXPAND-HOOK* has now been shown to be unusable for its
original stated purpose, it is of questionable usefulness.
Test Case:
;; #1: File compiling this definition in some implementations will produce
;; a definition that returns read-only list structure. The call to EVAL
;; on the result must not try to modify the read-only structure during
;; macroexpansion. [See issue CONSTANT-MODIFICATION.]
(DEFUN READ-ONLY-FOO () '(MACROLET ((FOO (&WHOLE FORM) (+ 1 1))) (FOO)))
(EVAL (READ-ONLY-FOO))
=> 2
;; #2: This constructs a form and then uses it in two places in another
;; constructed form. Each of the uses is in a different lexical
;; contour, so must be expanded differently.
(LET ((FOO (LIST 'FOO)))
(EVAL `(LIST (MACROLET ((FOO (&WHOLE FORM) '(+ 1 1))) ,FOO)
(MACROLET ((FOO (&WHOLE FORM) '(+ 1 2))) ,FOO))))
=> (2 3)
;; #3: This is effectively the same thing but involves a MACROLET
;; shadowing a DEFMACRO rather than two MACROLETs, since some
;; implementations might only be caching expansions that come
;; from DEFMACRO.
(DEFMACRO FOO (&WHOLE FORM) '(+ 1 1))
(LET ((FOO (LIST 'FOO)))
(EVAL `(LIST ,FOO (MACROLET ((FOO (&WHOLE FORM) '(+ 1 2))) ,FOO))))
=> (2 3)
Current Practice:
Symbolics Genera does not use displacing or table caching in either
the interpreter or compiler.
Symbolics Cloe, a compiled only implementation, uses table caching
to boost compilation by a little. Running the test cases above turned
up a bug (in test case #3), which is now in the process of being fixed.
[The fact that a bug was turned up in code written by a CL implementor
is an existence proof that the potential for trouble was not imagined.]
The TI Explorer evaluator does displacement of macros, but is careful
to correctly handle the cases exemplified in test cases #1 and #2.
It does not do the right thing for #3, but that is a bug that can
fairly easily be fixed.
Cost to Implementors:
This proposal is upward compatible with correct implementations.
Cost to Users:
There is no cost to users, unless they were using semantically invalid
or nonportable caching techniques. Nonportable caching techniques might
continue to work in some implementations.
Cost of Non-Adoption:
Continued confusion about the purpose of *MACROEXPAND-HOOK* and the
validity of macro caching techniques.
Benefits:
The misleading description of *MACROEXPAND-HOOK*'s purpose is
removed.
Aesthetics:
Most people agree that macro caching techniques are only supposed to
improve speed without affecting semantics. This proposal is only
intended to underscore that necessary truth. Insofar as this is only
a clarification, it presumably has no significant aesthetic impact.
Discussion:
∂22-Mar-89 1054 CL-Cleanup-mailer PRETTY-PRINT-INTERFACE, version 4
Received: from life.ai.mit.edu by SAIL.Stanford.EDU with TCP; 22 Mar 89 10:54:13 PST
Received: from wheat-chex.ai.mit.edu by life.ai.mit.edu; Wed, 22 Mar 89 13:54:36 EST
Received: from localhost by wheat-chex.ai.mit.edu; Wed, 22 Mar 89 13:54:34 EST
Date: Wed, 22 Mar 89 13:54:34 EST
From: dick@wheaties.ai.mit.edu
Message-Id: <8903221854.AA10669@wheat-chex.ai.mit.edu>
To: cl-cleanup@sail.stanford.edu
Subject: PRETTY-PRINT-INTERFACE, version 4
Version 3 (by Guy Steele Jr) supersedes version 2 and is changed from
version 1 as follows: adds a functional interface to supplement the
interface through FORMAT, and reflects comments by Barrett and
Pierson.
Version 4 (by Dick Waters) is changed from version 3 as follows: The
short summary is updated to reflect the functional interface. The
functional interface is changed following suggestions made by Dave Moon.
The proposal is amended in a few minor ways to increase the
compatibility with variable width fonts. Additional discussion has been
added with regard to the advantages of XP with regard to handling
circularity detection and abbreviation, the interaction with CLOS, and
the extended type specifier CONS used by XP.
The document attached to version 1 has also been fully revised, but is
sent in a separate message due to mailer problems.
--Dick
Issue: PRETTY-PRINT-INTERFACE
References: Description of XP by Dick Waters (attached)
*PRINT-PRETTY* (CLtL p. 371)
WRITE (CLtL p. 382)
PPRINT (CLtL p. 383)
FORMAT (CLtL pp. 385-407)
FORMAT ~T directive (CLtL pp. 398-399)
FORMAT ~< directive (CLtL pp. 404-406)
Related issues:
Category: CLARIFICATION CHANGE ADDITION
Edit history: Version 1, 24-Feb-89 by Steele
Version 2, 15-Mar-89 by Steele and Waters
Version 3, 15-Mar-89 by Steele
Version 4, 22-Mar-89 by Waters
Problem description:
At present, Common Lisp provides no specification whatsoever of how
pretty-printing is to be accomplished, and no way for the user to control
it. In particular, there is no protocol by which a user can write a
print-function for a structure, or a method for PRINT-OBJECT, that will
interact smoothly with the built-in pretty-printer in a portable manner.
Proposal (PRETTY-PRINT-INTERFACE:XP):
Adopt the interfaces and protocols of the XP pretty-printer by Dick Waters,
described in full in the attached 12-page document. Here is a very brief
summary of the proposal.
New variables: *PRINT-DISPATCH*
*PRINT-RIGHT-MARGIN*
*DEFAULT-RIGHT-MARGIN*
*PRINT-MISER-WIDTH*
*PRINT-LINES*
*LAST-ABBREVIATED-PRINTING*
New functions: COPY-PRINT-DISPATCH
FILL-STYLE
LINEAR-STYLE
TABULAR-STYLE
CONDITIONAL-NEWLINE
LOGICAL-BLOCK-TAB
LOGICAL-BLOCK-INDENT
New macros: DEFINE-PRINT-DISPATCH
WITHIN-LOGICAL-BLOCK
LOGICAL-BLOCK-COUNT
LOGICAL-BLOCK-POP
New FORMAT directives: ~W ~_ ~I ~:T ~/name/ ~<...~:>
New # reader macro: #"..."
The function WRITE is extended to accept additional keyword arguments
:DISPATCH, :RIGHT-MARGIN, :LINES, and :MISER-WIDTH corresponding to the
first four of the new variables.
Examples: See attached document.
Rationale:
There ought to be a good user interface to the pretty printer.
This is the only proposal for which there is a portable implementation
that has seen extensive use and is being made freely available.
Current practice:
XP son of PP son of GPRINT son of PRINT* is the latest in a line of pretty
printers that goes back 13 years. All of these printers use essentially
the same basic algorithm and conceptual interface. Further, except for
PRINT*, which was implemented solely to satisfy the author's personal
needs, each of these printers has had extensive use. XP has been in
experimental use as the pretty printer in CMU Common Lisp for 6 months. PP
has been the pretty printer in DEC Common Lisp for the past 3 years. Prior
to three years ago, GPRINT was used for 2 years as the pretty printer in
DEC Common Lisp. In addition, GPRINT has been the pretty printer in
various generations of Symbolics Lisp for upwards of 5 years.
(See Waters R.C., "User Format Control in a Lisp Prettyprinter", ACM TOPLAS,
5(4):513--531, October 1983.)
Cost to Implementors:
A fair amount of effort (perhaps a few man-weeks at most).
Source code for XP is available to all comers from Dick Waters, and
the system is documented in great detail:
Waters, Richard C., "XP: A Common Lisp Pretty Printing System",
Artificial Intelligence Laboratory Technical Memo 1102,
Massachusetts Institute of Technology, Cambridge MA, March 1989.
Cost to Users: None (I think). This is an upward-compatible extension.
Cost of non-adoption: Continued inability for user print-functions
to interact with the pretty-printer in a useful and portable manner.
Performance impact: XP is claimed to be quite fast.
Benefits: User control of pretty-printing in a portable manner.
Aesthetics:
Using ~<...~:> may strike some as uncomfortably close in the syntactic
space of FORMAT directives to the existing ~<...~>. However, it is very
unlikely that both of these directives (pretty-print logical block and
columnar justification, respectively) will be used in the same call to
FORMAT. Previous versions of XP used ~!...~. instead of ~<...~:> but this
made FORMAT strings very difficult to read; it is preferable to have
a directive that looks like matching brackets of some sort.
Dan Pierson comments: You might mention that some people will undoubtedly
find piling more hair on FORMAT ugly (of course these same people may
well find FORMAT in general ugly :-)).
Discussion:
Zetalisp used ~:T to mean pixelwise tabulation, so the use of ~:T
suggested here may be a problem. If so, another suggestion for naming
this directive would be appropriate.
The ~/.../ directive is already in Zetalisp, and is not an idea new
to this proposal. However, it should be noted that the proposal for
~/.../ here is simpler than, and incompatible with, the current Zetalisp
practice.
Guy Steele and Dick Waters strongly support this proposal. (As an example,
Guy Steele has a portable simulator for Connection Machine Lisp, and would
like very much to have xappings and xectors pretty-print properly.)
Dan Pierson comments: You can add me to the list of strong supporters of
this proposal. While the proposal is long and complex, it is supported by
a long history of usage in several different Lisp environments. Unlike
some earlier members of this family, this version fits cleanly enough into
the rest of Common Lisp to warrant standardization.
The utility of *PRINT-LINES* becomes more obvious if it is pointed out
that Dick's pretty printers are implemented to print each line as it
is computed. This means that a small value for *PRINT-LINES* saves
significant time as well as output medium space. In fact, many people
find that a very pleasant REP loop is created by setting *PRINT-LINES*
to a value from 1-4, *PRINT-PRETTY* to T, and defining a short-name
function (say (PP*)) that funcalls *LAST-ABBREVIATED-PRINTING* with
abbreviation bound off. This is almost as fast and compact as, and
MUCH more readable than, a non-pretty-printing REP loop.
The advantages of compiled format strings (format functions) should be
brought out as benefits in their own right. The current proposal just
mentions them as a minor feature of XP.
At first this struck me as a very cute end run around the failure of
STREAM-INFO, then I realized that one of the problems with STREAM-INFO
may have been that it was a standard at the wrong level. STREAM-INFO
permitted people to use XP, but not to count on it. This proposal
makes it possible to write portable code whose new data structures and
language elements print correctly in whatever Common Lisp environment
they're run in. [End of comments by Pierson]
It has been noted by Guy Steele that some places in the initial document
where it says that circularity detection is handled correctly, this is
true a fortiori following the decision on PRINT-CIRCLE-STRUCTURE.
However, Waters notes that the vote on PRINT-CIRCLE-STRUCTURE said
nothing about the handling of *PRINT-LEVEL*. Therefore, the fact that
XP handles *PRINT-LEVEL* correctly is an improvement.
In addition, PRINT-CIRCLE-STRUCTURE is also silent on what is supposed
to happen if a user program decomposes a list itself (e.g., with DOLIST
or ~{~}) rather than calling a print function. Assumedly *PRINT-CIRCLE*
etc. is not handled in this case. In contrast, if one uses
WITHIN-LOGICAL-BLOCK or ~<~:>, then *PRINT-CIRCLE*, *PRINT-LEVEL*, and
*PRINT-LENGTH* are all automatically handled correctly.
For example, (format nil "-~1{~A ~A ~A ~A ~A ~}-" '#1=(1 #1# 2 . #1#))
produces "-1 #1=(1 #1# 2 . #1#) 2 1 #1=(1 #1# 2 . #1#) -"
even under PRINT-CIRCLE-STRUCTURE, and
(format nil "-~1{~A ~}-" '#1=(1 #1# 2 . #1#))
causes infinite looping. However, in XP,
(format nil "-~:<~W ~W ~W ~W ~W~:>-" '#1=(1 #1# 2 . #1#))
produces "-#1=(1 #1# 2 . #1#)-".
This proves to be very useful when writing pretty printing functions for things.
Note also that ~<~:> supports *print-level* and *print-length* correctly.
All the same things can be said about the functional interface and using
WITHIN-LOGICAL-BLOCK rather than traversing a list yourself in some fashion.
All in all, Waters claims that PRINT-CIRCLE-STRUCTURE covers at most 1/4
of what XP does in support of *print-circle* and does not cover anything
of what XP does to support *print-level*, *print-length*, and
robustness in the face of malformed arguments. These are vital
features for writing printing functions that really work right all the time.
It has been noted by Dave Moon that things would be much more elegant if
DEFINE-PRINT-DISPATCH could be expressed directly as a CLOS DEFMETHOD
for an appropriate generic function. Dick Waters agrees with this.
However, DEFINE-PRINT-DISPATCH depends on type specifiers that are more
complex than the ones CLOS deals with and ones that do not have clear
subtype/supertype relationships, compensating for the latter problem by
supporting numerical priorities to disambiguate things. (The defaulting
behavior is a key feature of the pretty printer.) At the very least,
this means that DEFINE-PRINT-DISPATCH will not fit into CLOS in a simple way.
Given the problems, Moon suggests that "it does seem that right now it
might be best to keep a separate DEFINE-PRINT-DISPATCH macro, with the
idea that the expansion is implementation-dependent at the moment, but
might some day be changed to be defined to expand into DEFMETHOD. I
haven't looked to see whether any syntactic changes would be appropriate
to make that transition smoother."
(Waters also worries that the overhead needed to locate the right CLOS
method would seriously degrade the pretty printer, because the printer
has to do this for every part of every object printed. This dispatching is
currently done by very fast code that is tuned to take advantage of the
observed distribution of kinds of objects that have special pretty
printers attached to them. Even with this special purpose code,
dispatching takes a significant part of the pretty printer's time.)
Dave Moon also comments that it is not good to have something that looks
like a type specifier (i.e., the extended form of the CONS type specifier
used by DEFINE-PRINT-DISPATCH) and yet is not a real type specifier. He
suggests that we should either amend Common Lisp to accept the extended
form of the CONS type specifier, or stop having DEFINE-PRINT-DISPATCH
use it.
Waters supports any course of action that retains the use of the
extended CONS type specifier in conjunction with DEFINE-PRINT-DISPATCH.
However, he notes that the trade-off is clear. One could avoid the
complex CONS type specifier without any significant loss of
functionality by introducing a new macro DEFINE-LIST-PRINT-DISPATCH that
is identical to DEFINE-PRINT-DISPATCH except that it is relevant only to
conses and the type specifier applies to the CAR of the object to be
printed rather than to the object as a whole. However, this appears to
him to be significantly less elegant than the current approach.
-------------------- detailed documentation --------------------
The full description is too large to fit in with everything else in this
message. A fully correct version follows in a separate message. The
stuff below summarizes all of the changes from the full description in
version 1.
Amendments
To a considerable extent, the design of the XP interface is completely
neutral about the issue of variable- versus fixed- width fonts. In
particular, most of the discussion of how the formatting proceeds either
talks about absolute positions of zero or talks about something being
in the same horizontal position as something else. These statements are
all font-independent. (Further, although Waters' current implementation
does not support variable-width fonts, the algorithms used could be
extended to support them without radical changes.)
Nevertheless, there are 9 places where users specify explicit
non-zero lengths: the variables *PRINT-RIGHT-MARGIN*,
*DEFAULT-RIGHT-MARGIN*, and *PRINT-MISER-WIDTH*, the numeric
arguments to ~T, ~I, and ~/tabular-style/ and their associated functions
LOGICAL-BLOCK-TAB, LOGICAL-BLOCK-INDENT, and TABULAR-STYLE.
It is proposed that all of these lengths be in the same units, and that
this unit be ems (the length of an "m" in the font currently being used
to output characters to the relevant output stream at the moment that
the command is encountered or a variable is consulted).
It is further proposed that users and implementors be advised to set
things up so that explicit lengths do not have to be specified. For
implementors, this means making streams smart enough that they know how
wide they are. (This avoids the use of *PRINT-RIGHT-MARGIN* and
*DEFAULT-RIGHT-MARGIN* in most situations.) For users, this means
relying on streams knowing their own widths (which is a good idea for
adaptability in any case) and using ~:I to specify indentations wherever
possible. Further, it should be noted that since *PRINT-MISER-WIDTH* is
essentially heuristic in nature, it does not matter if its value is only
an approximate length and users will only need to change the
value of *PRINT-MISER-WIDTH* in unusual situations. This leaves only
tabbing as an area where explicit lengths have to be specified on a
regular basis. Fortunately, approximate lengths are often acceptable in
this situation as well.
Functional Interface
The primary interface to operations for dynamically determining the
arrangement of output is provided through FORMAT. This is done,
because FORMAT strings are typically the most convenient way of
interacting with pretty printing. However, these operations have
nothing inherently to do with FORMAT per se. In particular, they can
also be accessed via the six functions and macros below.
WITHIN-LOGICAL-BLOCK (STREAM-SYMBOL LIST                            [Macro]
                      :PREFIX :PER-LINE-PREFIX :SUFFIX)
                     &BODY BODY
In the manner of ~<...~:>, this macro causes printing to be
grouped into a logical block. The value NIL is always returned.
STREAM-SYMBOL must be a symbol. If it is NIL, it is treated the same as
if it were *STANDARD-OUTPUT*. If it is T, it is treated the same as if
it were *TERMINAL-IO*. The run-time value of STREAM-SYMBOL must be a
stream. The logical block is printed into this destination stream.
The BODY can contain any arbitrary Lisp forms. Within the BODY,
STREAM-SYMBOL is bound to a special kind of stream that supports dynamic
decisions about the arrangement of output and then forwards the output
to the destination stream. All the standard printing functions (e.g.,
WRITE, PRINC, TERPRI) can be used to print output into STREAM-SYMBOL.
All and only the output sent to STREAM-SYMBOL is treated as being in the
logical block. (It is an error to send any output directly to the
underlying destination stream.)
The :SUFFIX, :PREFIX, and :PER-LINE-PREFIX must all be expressions that
(at run time) evaluate to strings. :SUFFIX (which defaults to the null
string) specifies a suffix that is printed just after the logical block.
:PREFIX specifies a prefix to be printed before the beginning of the
logical block. :PER-LINE-PREFIX specifies a prefix that is printed
before the block and at the beginning of each new line in the block. It
is an error for :PREFIX and :PER-LINE-PREFIX to both be used. If neither
is used, a :PREFIX of the null string is assumed.
LIST is interpreted as being a list that BODY is responsible for
printing. If LIST does not (at run time) evaluate to a list, it is
printed using WRITE. If *PRINT-CIRCLE* is not NIL and LIST is a
circular reference to a cons, then an appropriate #n# marker is printed.
If *PRINT-LEVEL* is not NIL and the logical block is at a dynamic
nesting depth of greater than *PRINT-LEVEL* in logical blocks, # is
printed. If any of the three conditions above occurs, the indicated
special output is printed on STREAM-SYMBOL and the BODY is skipped along
with the printing of the prefix and suffix. (If the BODY is
not responsible for printing a list, then the first two tests above can
be turned off by supplying NIL for the LIST argument.)
CONDITIONAL-NEWLINE KIND &OPTIONAL (STREAM *STANDARD-OUTPUT*) [Function]
CONDITIONAL-NEWLINE is the functional equivalent of ~_. STREAM (which
defaults to *STANDARD-OUTPUT*) follows the standard conventions for
stream arguments to printing functions (i.e., NIL stands for
*STANDARD-OUTPUT* and T stands for *TERMINAL-IO*). The KIND argument
specifies the style of conditional newline. It must be one of :LINEAR,
:FILL, :MISER, or :MANDATORY. If STREAM is a special stream bound by
WITHIN-LOGICAL-BLOCK, a conditional newline is sent to it. Otherwise,
CONDITIONAL-NEWLINE has no effect. The value NIL is always returned.
LOGICAL-BLOCK-INDENT RELATIVE-TO N &OPTIONAL (STREAM *STANDARD-OUTPUT*) [Function]
LOGICAL-BLOCK-INDENT is the functional equivalent of ~I. STREAM (which
defaults to *STANDARD-OUTPUT*) follows the standard conventions for
stream arguments to printing functions. N specifies the indentation in
ems. If RELATIVE-TO is :BLOCK, this indentation is relative to the
start of the enclosing block (as for ~I). If RELATIVE-TO is :CURRENT,
the indentation is relative to the current output position (as for ~:I).
It is an error for RELATIVE-TO to take on any other value. If STREAM is
a special stream bound by WITHIN-LOGICAL-BLOCK, LOGICAL-BLOCK-INDENT
sets the indentation in the innermost enclosing logical block.
Otherwise, LOGICAL-BLOCK-INDENT has no effect. The value NIL is always
returned.
LOGICAL-BLOCK-TAB KIND COLNUM COLINC &OPTIONAL (STREAM *STANDARD-OUTPUT*) [Function]
LOGICAL-BLOCK-TAB is the functional equivalent of ~T. STREAM (which
defaults to *STANDARD-OUTPUT*) follows the standard conventions for
stream arguments to printing functions. The arguments COLNUM and COLINC
correspond to the two numeric parameters to ~T and are in terms of ems.
The KIND argument specifies the style of tabbing. It must be one of
:LINE (tab using ~T), :BLOCK (tab using ~:T), :LINE-RELATIVE (tab using
~@T), or :BLOCK-RELATIVE (tab using ~:@T). If STREAM is a special
stream bound by WITHIN-LOGICAL-BLOCK, tabbing is performed. Otherwise,
LOGICAL-BLOCK-TAB has no effect. The value NIL is always returned.
LOGICAL-BLOCK-POP ARGS &OPTIONAL (STREAM *STANDARD-OUTPUT*) [Macro]
LOGICAL-BLOCK-COUNT &OPTIONAL (STREAM *STANDARD-OUTPUT*) [Macro]
LOGICAL-BLOCK-POP is identical to POP except that it supports
*PRINT-LENGTH* and *PRINT-CIRCLE*. It is an error to use
LOGICAL-BLOCK-POP anywhere other than syntactically nested within a
call on WITHIN-LOGICAL-BLOCK.
ARGS must be a symbol or expression acceptable to POP. STREAM (which
defaults to *STANDARD-OUTPUT*) follows the standard conventions for
stream arguments to printing functions. If STREAM is a special stream
bound by WITHIN-LOGICAL-BLOCK, then LOGICAL-BLOCK-POP performs the
special operations described below. Otherwise, LOGICAL-BLOCK-POP is
identical to POP.
Each time LOGICAL-BLOCK-POP is called, it performs three tests. If
ARGS is not a cons, ". " is printed followed by ARGS. If
*PRINT-LENGTH* is not NIL and LOGICAL-BLOCK-POP has already been called
*PRINT-LENGTH* times within the immediately containing logical block,
"..." is printed. If *PRINT-CIRCLE* is not NIL, and ARGS is a circular
reference, then ". " is printed followed by an appropriate #n# marker.
If any of the three conditions above occurs, the special output is
printed on STREAM and the execution of the immediately containing
WITHIN-LOGICAL-BLOCK is terminated except for the printing of the
suffix. Otherwise, LOGICAL-BLOCK-POP pops the top value off of ARGS
and returns this value.
LOGICAL-BLOCK-COUNT is identical to LOGICAL-BLOCK-POP except that it
does not take an ARGS argument, always returns NIL, and only performs
the second test discussed above. It is useful when the components of a
non-list are being printed.
Using the functions above, TABULAR-STYLE could be defined as follows.
  (defun tabular-style (s list &optional (colon? T) atsign? (tabsize nil))
      (declare (ignore atsign?))
    (if (null tabsize) (setq tabsize 16))
    (within-logical-block (s list :prefix (if colon? "(" "")
                                  :suffix (if colon? ")" ""))
      (when list
        (loop (write (logical-block-pop list s) :stream s)
              (if (null list) (return nil))
              (write-char #\space s)
              (logical-block-tab :block-relative 0 tabsize s)
              (conditional-newline :fill s)))))
The function below prints a vector using #(...) notation.
  (defun print-vector (v *standard-output*)
    (within-logical-block (nil nil :prefix "#(" :suffix ")")
      (let ((end (length v)) (i 0))
        (when (plusp end)
          (loop (logical-block-count)
                (write (aref v i))
                (if (= (incf i) end) (return nil))
                (write-char #\space)
                (conditional-newline :fill))))))
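For readers who want to try the idea on a Lisp that provides the pretty
printer under the names later adopted in ANSI Common Lisp rather than the
names proposed here, the following is a minimal sketch. It assumes that
PPRINT-LOGICAL-BLOCK, PPRINT-NEWLINE, PPRINT-POP, and
PPRINT-EXIT-IF-LIST-EXHAUSTED play the roles of WITHIN-LOGICAL-BLOCK,
CONDITIONAL-NEWLINE, and LOGICAL-BLOCK-POP. PRINT-FILLED is a hypothetical
name, and unlike LOGICAL-BLOCK-POP as described above, PPRINT-POP takes no
arguments.

```lisp
;; Sketch only: uses the ANSI names, not the names in this proposal.
;; PRINT-FILLED is a hypothetical helper that prints LIST in fill style.
(defun print-filled (list &optional (stream *standard-output*))
  (pprint-logical-block (stream list :prefix "(" :suffix ")")
    ;; Leave the block (printing only the suffix) if the list is empty.
    (pprint-exit-if-list-exhausted)
    (loop
      ;; PPRINT-POP honors *PRINT-LENGTH* and *PRINT-CIRCLE*.
      (write (pprint-pop) :stream stream)
      (pprint-exit-if-list-exhausted)
      (write-char #\Space stream)
      ;; A fill-style conditional newline between elements.
      (pprint-newline :fill stream)))
  nil)
```

Calling (PRINT-FILLED '(1 2 3)) should print the elements separated by
single spaces inside parentheses whenever they fit on one line.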
FILL-STYLE STREAM LIST &OPTIONAL (COLON? T) ATSIGN?
LINEAR-STYLE STREAM LIST &OPTIONAL (COLON? T) ATSIGN?
TABULAR-STYLE STREAM LIST &OPTIONAL (COLON? T) ATSIGN? (TABSIZE 16)
The directives ~/fill-style/, ~/linear-style/, and ~/tabular-style/ are
supported by the three functions above. These functions can also be
called directly by the user. Each function prints parentheses around
the output if and only if COLON? (default T) is not NIL. Each function
ignores its ATSIGN? argument and returns NIL. (These arguments are
optional to facilitate the direct use of the three functions.) Each
function handles abbreviation and circularity detection correctly, and
uses WRITE to print LIST when given a non-list argument.
The function LINEAR-STYLE prints a list either all on one line, or with
each element on a separate line. The function FILL-STYLE prints a list
with as many elements as possible on each line. The function
TABULAR-STYLE is the same as FILL-STYLE except that it prints the
elements so that they line up in columns. This function takes an
additional argument TABSIZE (default 16) that specifies the column
spacing in ems.
[End of attached document]
∂26-Jan-89 1215 CL-Cleanup-mailer Issue: PRINT-CASE-PRINT-ESCAPE-INTERACTION (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 26 Jan 89 12:15:37 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 527498; Thu 26-Jan-89 15:13:14 EST
Date: Thu, 26 Jan 89 15:13 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: PRINT-CASE-PRINT-ESCAPE-INTERACTION (Version 1)
To: CL-Cleanup@SAIL.Stanford.EDU
cc: KMP@STONY-BROOK.SCRC.Symbolics.COM
Message-ID: <890126151314.6.KMP@BOBOLINK.SCRC.Symbolics.COM>
Trust me. You're gonna love this one! -kmp
-----
Issue: PRINT-CASE-PRINT-ESCAPE-INTERACTION
Forum: Cleanup
References: *PRINT-ESCAPE* (pp370-371), *PRINT-CASE* (p372), WRITE
Category: CLARIFICATION
Edit history: 26-Jan-89, Version 1 by Pitman
Status: For Internal Discussion
Problem Description:
The wording on page 372 of CLtL uses fuzzy terms that make it hard
to tell if *PRINT-ESCAPE* interacts with *PRINT-CASE*.
Paragraph 1 of the description of *PRINT-CASE* says "This variable
controls the case (upper, lower, or mixed) in which to print any
uppercase characters in the names of symbols when vertical-bar
syntax is used."
Paragraph 2 begins with the seemingly unambiguous statement "Lowercase
characters in the internal print name are always printed in lowercase"
but then goes on to muddy the water by saying "and are preceded by
a single escape character or enclosed by multiple escape characters".
This escaping presumably only happens in *PRINT-ESCAPE* T mode, which
leads one to wonder if other parts of the *PRINT-CASE* description are
implicitly controlled by *PRINT-ESCAPE* as well.
The function WRITE is affected by implication.
Proposal (PRINT-CASE-PRINT-ESCAPE-INTERACTION:LIKE-PRIN1):
Define that *PRINT-CASE* cases characters the same as PRIN1 would.
Proposal (PRINT-CASE-PRINT-ESCAPE-INTERACTION:LIKE-WRITE-STRING):
Define that *PRINT-CASE* has an effect only when *PRINT-ESCAPE* is T.
When *PRINT-ESCAPE* is NIL, WRITE-STRING is used directly.
Proposal (PRINT-CASE-PRINT-ESCAPE-INTERACTION:VERTICAL-BAR-RULE-NO-UPCASE):
Define that *PRINT-CASE* has an effect at all times when *PRINT-ESCAPE*
is NIL. Define that *PRINT-CASE* also has an effect when *PRINT-ESCAPE*
is T unless inside an escape context (i.e., unless between vertical bars
or after a slash). Under no circumstance is any character which was
lowercase in the internal representation ever presented as uppercase
due to *PRINT-CASE*.
Proposal (PRINT-CASE-PRINT-ESCAPE-INTERACTION:VERTICAL-BAR-RULE-PERMIT-UPCASE):
Like VERTICAL-BAR-RULE-NO-UPCASE, but permit *PRINT-CASE* to upcase
lowercase characters in the *PRINT-ESCAPE* NIL case since preservation of
Lisp syntax is not important there.
Proposal (PRINT-CASE-PRINT-ESCAPE-INTERACTION:EXPLICITLY-VAGUE):
Specify that the effect of *PRINT-CASE* when *PRINT-ESCAPE* is NIL
is implementation-dependent.
Test Case:
  (LET ((RESULT '()) (TABWIDTH 12))
    (DOLIST (SYMBOL '(|x| |FoObAr| |fOo|))
      (LET ((TAB -1))
        (FORMAT T "~&")
        (DOLIST (ESCAPE '(T NIL))
          (DOLIST (CASE '(:UPCASE :DOWNCASE :CAPITALIZE))
            (FORMAT T "~VT" (* (INCF TAB) TABWIDTH))
            (WRITE SYMBOL :ESCAPE ESCAPE :CASE CASE))))))
For each of the following, two clusters of output are shown. The first is
how an implementation which leans heavily on vertical bars might work.
The second is how an implementation which leans heavily on slash might
work. In fact, other interpretations are possible (mixing slash and
vertical bar). These examples are not an exhaustive analysis of the
implications of each proposal.
Possible outputs under LIKE-PRIN1:
  |x|         |x|         |x|         x           x           x
  |FoObAr|    |FoObAr|    |FoObAr|    FoObAr      FoObAr      FoObAr
  |fOo|       |fOo|       |fOo|       fOo         fOo         fOo

  \x          \x          \x          x           x           x
  F\oO\bA\r   f\oo\ba\r   F\oo\ba\r   FoObAr      foobar      Foobar
  \fO\o       \fo\o       \fo\o       fOo         foo         foo
Possible output under LIKE-WRITE-STRING:
  |x|         |x|         |x|         x           x           x
  |FoObAr|    |FoObAr|    |FoObAr|    FoObAr      FoObAr      FoObAr
  |fOo|       |fOo|       |fOo|       fOo         fOo         fOo

  \x          \x          \x          x           x           x
  F\oO\bA\r   f\oo\ba\r   F\oo\ba\r   FoObAr      FoObAr      FoObAr
  \fO\o       \fo\o       \fo\o       fOo         fOo         fOo
Possible output under VERTICAL-BAR-RULE-NO-UPCASE:
  |x|         |x|         |x|         x           x           x
  |FoObAr|    |FoObAr|    |FoObAr|    FoObAr      foobar      Foobar
  |fOo|       |fOo|       |fOo|       fOo         foo         foo

  \x          \x          \x          x           x           x
  F\oO\bA\r   f\oo\ba\r   F\oo\ba\r   FoObAr      foobar      Foobar
  \fO\o       \fo\o       \fo\o       fOo         foo         foo
Possible output under VERTICAL-BAR-RULE-PERMIT-UPCASE:
  |x|         |x|         |x|         X           x           X
  |FoObAr|    |FoObAr|    |FoObAr|    FOOBAR      foobar      Foobar
  |fOo|       |fOo|       |fOo|       FOO         foo         Foo

  \x          \x          \x          X           x           X
  F\oO\bA\r   f\oo\ba\r   F\oo\ba\r   FOOBAR      foobar      Foobar
  \fO\o       \fo\o       \fo\o       FOO         foo         Foo
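To compare an implementation against the tables above without reading
column positions, the test case can be recast to collect its results as
strings. The following is a sketch under that assumption;
PRINT-CASE-MATRIX is a hypothetical name, and the strings it returns
depend on which of the interpretations the implementation follows.

```lisp
;; Sketch: gather the six outputs of the test case for one symbol.
;; PRINT-CASE-MATRIX is a hypothetical helper, not part of any proposal.
(defun print-case-matrix (symbol)
  "Return a list of (ESCAPE CASE-MODE STRING) entries for SYMBOL."
  (let ((rows '()))
    (dolist (escape '(t nil) (nreverse rows))
      (dolist (case-mode '(:upcase :downcase :capitalize))
        (push (list escape
                    case-mode
                    ;; WRITE-TO-STRING binds *PRINT-ESCAPE* and
                    ;; *PRINT-CASE* from the keyword arguments.
                    (write-to-string symbol
                                     :escape escape
                                     :case case-mode))
              rows)))))

;; Example use: print one row per combination.
(defun show-print-case-matrix (symbol)
  (dolist (row (print-case-matrix symbol))
    (format t "~&~S~10T~S~24T~A~%" (first row) (second row) (third row))))
```

Under every proposal listed, the :ESCAPE NIL entries differ from the
symbol's name at most in the case of their letters, which makes them easy
to check mechanically.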
Rationale:
It's silly for implementations to vary on this point.
Current Practice:
A strict reading of CLtL suggests that probably VERTICAL-BAR-RULE-NO-UPCASE
is the most correct.
Symbolics Genera doesn't do any of these. It was trying to do
VERTICAL-BAR-RULE-NO-UPCASE, but it had a bug which was about to be fixed when
this issue was raised.
Cost to Implementors:
Probably trivial.
Cost to Users:
Negligible in most cases. Probably very few users depend on this.
Those who do depend on it probably do so because they think the
behavior doesn't vary and probably don't get the portable behavior they
expect.
Cost of Non-Adoption:
Gratuitous variation between implementations.
Benefits:
Cost of non-adoption is avoided.
Aesthetics:
Anything that makes the language tighter probably improves aesthetics.
Only VERTICAL-BAR-RULE-PERMIT-UPCASE and LIKE-WRITE-STRING have really
useful behaviors in the :ESCAPE NIL situation. Of these, perhaps only
VERTICAL-BAR-RULE-PERMIT-UPCASE is really visually pleasant.
Discussion:
Pitman doesn't think the particular choice is very important. He just
wants the issue to be resolved. His slight preference is for
VERTICAL-BAR-RULE-PERMIT-UPCASE, then LIKE-WRITE-STRING, then either
of LIKE-PRIN1 or VERTICAL-BAR-RULE-NO-UPCASE. He sees no reason to go
with EXPLICITLY-VAGUE unless we deadlock.
Michael Greenwald, who raised the issue at Symbolics, doesn't have
a preference either but believes that CLtL (perhaps unintentionally)
leans toward VERTICAL-BAR-RULE-NO-UPCASE.
∂25-Mar-89 2231 X3J13-mailer **DRAFT** Issue: READ-CASE-SENSITIVITY (Version 2)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 25 Mar 89 22:31:39 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 565470; Sun 26-Mar-89 01:31:28 EST
Date: Sun, 26 Mar 89 01:30 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: **DRAFT** Issue: READ-CASE-SENSITIVITY (Version 2)
To: X3J13@SAIL.Stanford.EDU
Message-ID: <890326013058.8.KMP@BOBOLINK.SCRC.Symbolics.COM>
>>> PLEASE DO -NOT- REPLY TO THIS ISSUE <<<
This is just for the reference of anyone still reading mail this close
to the meeting, so that you will have seen it in case it (or some further
revision) comes up at the meeting. If you have any comments, just bring
them to the meeting. Thanks. -kmp
-----
Issue: READ-CASE-SENSITIVITY
Forum: Cleanup
References: CLtL p 334 ff: What the Read Function Accepts,
especially p 337, step 8, point 1.
CLtL p 360 ff: The Readtable
COPY-READTABLE (CLtL, p 361)
*PRINT-CASE* (CLtL, p 372)
Category: ADDITION/CHANGE
Edit history: Version 1, 15-Feb-89, by Dalton
Version 2, 23-Mar-89, by Dalton,
(completely new proposal after comments from
Pitman, Gray, Masinter, and R.Tobin@uk.ac.ed)
Problem Description:
The Common Lisp reader always converts unescaped constituent
characters to upper case. (See CLtL, p 337, step 8, point 1.)
This behavior is not always desirable.
1. Lisp applications often use the Lisp reader to read their data.
This is often significantly easier than writing input routines
from scratch, especially if the input can be structured as lists.
However, certain applications want to make use of case distinctions,
and Common Lisp makes this unreasonably difficult. (You must define
every letter as a read macro and have the macro function read the
rest of the symbol, or else you must write a reader from scratch.)
2. Some programming languages distinguish between upper and lower
case in identifiers, and useful conventions are often built around
such distinctions. For example, in C, constants are often written
in upper case and variables in lower. In Mesa(?) and Smalltalk(?),
a capital letter is used to indicate the beginning of a new word
in identifiers made up of several words. In Edinburgh Prolog,
variables begin with upper-case letters and constant symbols do
not. The case-insensitivity of the Common Lisp reader makes
it difficult to use conventions of this sort.
Proposal (READ-CASE-SENSITIVITY:READTABLE-KEYWORDS)
Define a new settable function, (READTABLE-CASE <readtable>) to
control the reader's interpretation of case. The following values
may be given:
:UPCASE -- convert unescaped characters to upper-case, as now.
:DOWNCASE -- convert unescaped characters to lower-case.
:PRESERVE -- don't convert, leaving lower-case letters in lower
case and upper-case characters in upper case.
:INVERT -- convert lower-case to upper and upper-case to lower.
COPY-READTABLE copies the setting of READTABLE-CASE. The value of
READTABLE-CASE for the standard readtable is :UPCASE.
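The effect of the four keywords on a token can be modeled without touching
the reader at all. The sketch below is only such a model: TRANSLATE-TOKEN is
a hypothetical name, and it assumes the token contains no escape
characters, so that every letter is subject to conversion.

```lisp
;; Model of the proposed conversions, applied to a token string that
;; contains no escape characters.  TRANSLATE-TOKEN is hypothetical.
(defun translate-token (token mode)
  (ecase mode
    (:upcase   (string-upcase token))
    (:downcase (string-downcase token))
    (:preserve (copy-seq token))
    (:invert   (map 'string
                    (lambda (ch)
                      ;; Swap the case of letters; leave others alone.
                      (cond ((upper-case-p ch) (char-downcase ch))
                            ((lower-case-p ch) (char-upcase ch))
                            (t ch)))
                    token))))
```

Applied to "Zebra", the four modes give the names "ZEBRA", "zebra",
"Zebra", and "zEBRA" respectively.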
The READTABLE-CASE of a readtable also has significance when
printing. The case in which letters are printed is determined as
follows:
When READTABLE-CASE is :UPCASE, upper-case letters are printed in the
case specified by *PRINT-CASE*.
When READTABLE-CASE is :DOWNCASE, lower-case letters are printed in
the case specified by *PRINT-CASE*.
When READTABLE-CASE is :PRESERVE, letters are printed in their own
case.
When READTABLE-CASE is :INVERT, the case of all letters is inverted.
(The behavior when *PRINT-CASE* is :CAPITALIZE is like :UPCASE for
the first character and :DOWNCASE for the rest.)
The rules for escaping letters are also affected by the READTABLE-CASE.
If *PRINT-ESCAPE* is true, letters are escaped as follows:
When READTABLE-CASE is :UPCASE, all lower-case letters must be escaped.
When READTABLE-CASE is :DOWNCASE, all upper-case letters must be escaped.
Otherwise, no letters need be escaped.
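The escaping rule can be stated as a small predicate. The following is a
sketch with hypothetical names (LETTER-NEEDS-ESCAPE-P and
NAME-NEEDS-ESCAPE-P). It considers only the case of letters and ignores
every other reason a symbol name might need escaping, such as whitespace
or macro characters.

```lisp
;; Sketch of the proposed escaping rule; both names are hypothetical.
(defun letter-needs-escape-p (char mode)
  "True if CHAR must be escaped on account of its case under MODE."
  (ecase mode
    (:upcase   (lower-case-p char))
    (:downcase (upper-case-p char))
    ((:preserve :invert) nil)))

(defun name-needs-escape-p (name mode)
  "True if any letter of the string NAME must be escaped under MODE."
  (and (some (lambda (ch) (letter-needs-escape-p ch mode)) name)
       t))
```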
Proposal (READ-CASE-SENSITIVITY:READTABLE-FUNCTION)
Define a new settable function (READTABLE-CHARACTER-TRANSLATION
<readtable>) to control the reader's interpretation of unescaped
constituent characters. The value may be any function of type
(FUNCTION (CHARACTER) CHARACTER). Where the reader now converts
such characters to upper case it should instead call the function
that is the value of READTABLE-CHARACTER-TRANSLATION for the current
readtable. (See CLtL, page 337, step 8, point 1.)
COPY-READTABLE copies the setting of READTABLE-CHARACTER-TRANSLATION.
The value for the standard readtable is CHAR-UPCASE.
The READTABLE-CHARACTER-TRANSLATION of a readtable also has
significance when printing. The printer recognizes certain functions
which control the reader's interpretation of case and alters its
behavior accordingly. This behavior is given by the following
correspondence between functions and the keywords described above.
[This is just to avoid repeating a lot of text.]
  function              keyword
  CHAR-UPCASE           :UPCASE
  CHAR-DOWNCASE         :DOWNCASE
  IDENTITY              :PRESERVE
  CHAR-INVERT-CASE      :INVERT
The function can be given either as a symbol or as one of the values
#'CHAR-UPCASE, #'CHAR-DOWNCASE, #'IDENTITY, #'CHAR-INVERT-CASE.
If the READTABLE-CHARACTER-TRANSLATION is not one of the functions
listed above, letters are always printed in their own case (in
particular, *PRINT-CASE* has no effect), and all characters in
symbol names are escaped if *PRINT-ESCAPE* is true.
Define a new function CHAR-INVERT-CASE of type (FUNCTION (CHARACTER)
CHARACTER) analogous to CHAR-UPCASE and CHAR-DOWNCASE. It attempts
to convert its argument to upper-case if the argument is lower-case
and to lower-case if the argument is upper-case.
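A possible definition, given only as a sketch of the behavior described
above and written in terms of the existing character functions:

```lisp
;; Sketch of the proposed CHAR-INVERT-CASE.  Characters without case
;; are returned unchanged, as with CHAR-UPCASE and CHAR-DOWNCASE.
(defun char-invert-case (char)
  (cond ((upper-case-p char) (char-downcase char))
        ((lower-case-p char) (char-upcase char))
        (t char)))
```

For example, (MAP 'STRING #'CHAR-INVERT-CASE "Zebra") would produce
"zEBRA".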
Rationale:
There are a number of different ways to achieve case-sensitivity.
These proposals are fairly simple but provide all of the
functionality that one could reasonably expect.
By using a property of the readtable, we avoid introducing a new
special variable. Any code that wishes to control all of the
reader's parameters already takes *READTABLE* into account. A new
special variable would require such code to change.
:DOWNCASE is included for symmetry with :UPCASE. :INVERT is
included so that case conventions could be used in Common Lisp code
without requiring that the names of symbols in the "LISP" package be
written in upper case. (Opinions vary as to whether it is advisable
to use such conventions, but this proposal leaves that choice to the
user.)
In order to avoid complex interactions between the case setting of
the readtable and *PRINT-CASE*, this proposal specifies a
significance for *PRINT-CASE* only when the case setting is :UPCASE
or :DOWNCASE. The meaning of *PRINT-CASE* when the readtable
setting is :DOWNCASE was chosen for its simplicity and for symmetry
with :UPCASE while still being useful.
Test Case:
  ;; keyword version
  (let ((rt (copy-readtable nil)))
    (mapcar
      #'(lambda (case)
          (setf (readtable-case rt) case)
          (let ((*readtable* rt))       ;read with RT, not the current readtable
            (read-from-string "Zebra")))
      '(:upcase :downcase :preserve :invert)))
  => (ZEBRA |zebra| |Zebra| |zEBRA|)    ;as printed with the standard
                                        ;readtable and *print-case* :upcase
Current Practice:
While there may not be any current implementation that supports
exactly this proposal, several implementations provide some means
for changing case sensitivity.
Franz Inc's ExCL has a function, EXCL:SET-CASE-MODE, that sets both
the "preferred case" (the case of characters in the print names of
standard symbols such as CAR) and whether or not the reader is case-
sensitive.
In Symbolics Common Lisp, the function SET-CHARACTER-TRANSLATION
can be used to make the translation of a letter be that same letter,
thus achieving case-sensitivity.
Xerox Medley has a function for setting a readtable flag that
determines case sensitivity.
Cost to Implementors:
Fairly small. The reader will be slightly slower and readtables
will be slightly more complex.
Cost to Users:
Slight. Programmers must already take into account the possibility
that *READTABLE* will be a non-standard readtable. Case-sensitivity
is no worse than character macros in this respect.
Cost of Non-Adoption:
Applications that want to read mixed-case expressions will not
be able to use the Common Lisp reader to do so (except, perhaps,
by tortuous use of read macros).
Programming styles that rely on case distinctions (without escape
characters) will be effectively impossible in Common Lisp.
Benefits:
Applications will be able to read mixed-case expressions.
Programmers will be able to make use of case distinctions.
Aesthetics:
For the proposals:
The language will have greater symmetry, because it will be
possible to control the treatment of case on both input and output
instead of only on output (as is now the case).
The language will look less old-fashioned.
Against the proposals:
It is, perhaps, inconsistent to control case-sensitivity by a
readtable operation when other aspects of the reader, such as the
input base and the default float format (not to mention the
package), are controlled by special variables. However, it can be
argued that character-level syntax is determined chiefly by the
readtable. Case-sensitivity can be seen as analogous to character
macros in this respect.
Keywords vs function
The keyword proposal is somewhat simpler and avoids raising the
possibility of character translation that applies in general and
not just for unescaped constituents.
The function proposal is perhaps more elegant.
Discussion:
Dalton supports both proposals but slightly prefers READTABLE-CASE.
Version 1 of the proposal suggested a new global variable rather
than a property of the readtable. Pitman was strongly opposed to
that proposal and gave convincing arguments that it should be
dropped. Gray suggested that the readtable property should be a
function.
∂22-Mar-89 0931 X3J13-mailer Issue: SETF-MULTIPLE-STORE-VARIABLES (Version 2)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 22 Mar 89 09:31:05 PST
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 562950; Wed 22-Mar-89 12:30:11 EST
Date: Wed, 22 Mar 89 12:29 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: SETF-MULTIPLE-STORE-VARIABLES (Version 2)
To: X3J13@SAIL.STANFORD.EDU
cc: Pavel.pa@XEROX.COM, GSB@STONY-BROOK.SCRC.Symbolics.COM
Message-ID: <19890322172956.1.MOON@EUPHRATES.SCRC.Symbolics.COM>
Line-fold: No
This proposal didn't quite make it to the January meeting, due to
unclear responsibilities for who was supposed to update it from the
discussion. I have filled the gap and made the changes implied by the
discussion back in December of last year. We can't vote on this if
someone invokes the two-week rule, but perhaps no one will.
Issue: SETF-MULTIPLE-STORE-VARIABLES
References: CLtL, pp.93-107
Lisp Pointers, v2n2, pp.27-41
Category: ADDITION
Edit history: Version 1, 5-Dec-88, Pavel
Version 2, 22-Mar-89, Moon, simplify, update from discussion
Problem description:
The description of GET-SETF-METHOD-MULTIPLE-VALUE on page 107 of CLtL
states that there are no cases in Common Lisp that allow multiple values
to be stored into a generalized variable. This is seen by some as an
arbitrary decision in light of the fact that a very reasonable semantics
exists for multiple values being assigned by several Common Lisp macros,
including SETF. The rationale on page 103 of CLtL suggests that this
decision might be changed in the future.
Proposal (SETF-MULTIPLE-STORE-VARIABLES:ALLOW):
Extend the semantics of the macros SETF, PSETF, SHIFTF, ROTATEF, and
ASSERT to allow "places" whose SETF methods have more than one "store
variable". In such cases, the macros accept as many values from the
newvalue form as there are store variables. As usual, extra values
are ignored and missing values default to NIL.
Extend the long form of DEFSETF to allow the specification of more
than one "store variable", with the obvious semantics.
Clarify that GET-SETF-METHOD signals an error if there would be more
than one store-variable.
Test Cases/Examples:
  (defstruct region width height)

  (defun region-size (region)
    (values
      (region-width region)
      (region-height region)))

  (defsetf region-size (region) (width height)
    `(values
       (setf (region-width ,region) ,width)
       (setf (region-height ,region) ,height)))

  (setf my-reg (make-region :width 10 :height 20))
   => #S(REGION :WIDTH 10 :HEIGHT 20)
  (region-size my-reg)
   => 10
      20
  (setf (region-size my-reg) (values 30 40))
   => 30
      40
  (region-size my-reg)
   => 30
      40
Rationale:
This change removes an artificial restriction on the semantics of
several Common Lisp macros, allowing a broader set of contexts in
which generalized variables can be used. For example, it is not
difficult to write a reasonable SETF method for the VALUES function,
yielding a powerful MULTIPLE-VALUE-SETF form:
  (setf (values (car a) (gethash b 'c) (aref d 13))
        (some-hairy-computation))
In the language as currently defined, this example would have to be
written:
  (multiple-value-bind (x y z)
      (some-hairy-computation)
    (setf (car a) x
          (gethash b 'c) y
          (aref d 13) z))
Many other (perhaps more compelling) examples of generalized variables
holding more than one value can easily be imagined. Their use,
however, is severely discouraged by Common Lisp as defined in CLtL,
since none of the built-in macros will accept them.
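As a hedged illustration of how such a SETF method for a VALUES-like
function might look, the sketch below uses the names
DEFINE-SETF-EXPANDER and GET-SETF-EXPANSION, under which
DEFINE-SETF-METHOD and GET-SETF-METHOD-MULTIPLE-VALUE are available in
later Common Lisp implementations. MY-VALUES is a hypothetical function,
and the sketch assumes that each sub-place has exactly one store variable
and that SETF already accepts multiple store variables as proposed.

```lisp
;; MY-VALUES is a hypothetical stand-in for VALUES.
(defun my-values (&rest args)
  (values-list args))

;; Sketch: combine the setf expansions of the sub-places, giving the
;; whole place one store variable per sub-place.
(define-setf-expander my-values (&rest places &environment env)
  (let ((temps '()) (vals '()) (stores '())
        (store-forms '()) (access-forms '()))
    (dolist (place places)
      (multiple-value-bind (ts vs ss store access)
          (get-setf-expansion place env)
        ;; Assumption: sub-places themselves store a single value.
        (assert (= (length ss) 1) ()
                "Sub-place ~S needs exactly one store variable." place)
        (setf temps (append temps ts)
              vals  (append vals vs))
        (push (first ss) stores)
        (push store store-forms)
        (push access access-forms)))
    (values temps
            vals
            (nreverse stores)
            `(values ,@(nreverse store-forms))
            `(my-values ,@(nreverse access-forms)))))
```

With this in place, a form such as (SETF (MY-VALUES (CAR A) B) (FLOOR 7 2))
would store the quotient in (CAR A) and the remainder in B.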
The clarification of GET-SETF-METHOD makes explicit what is implied
by CLtL (CLtL uses the word "guarantee", whose relationship to
signalling of errors is unclear).
Current practice:
I do not know of any implementations that allow all of this extension.
Xerox Lisp does not signal an error, but this is probably due to a bug
in GET-SETF-METHOD. Lucid signals an error in GET-SETF-METHOD.
Symbolics Genera supports the proposal in SETF and PSETF, but not in
SHIFTF, ROTATEF, and ASSERT.
Cost to Implementors:
A relatively minor fix to each of the affected macros suffices. For
example, to fix SETF itself, one need only call
GET-SETF-METHOD-MULTIPLE-VALUE instead of GET-SETF-METHOD and emit a
MULTIPLE-VALUE-BIND instead of a LET for binding the store variables.
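A minimal sketch of that fix follows, for a SETF restricted to a single
place and new-value pair. SETF-1 is a hypothetical name, and
GET-SETF-EXPANSION is assumed here as the name under which
GET-SETF-METHOD-MULTIPLE-VALUE is available in later Common Lisp
implementations. The REGION definitions repeat the example given earlier
so that the sketch is self-contained.

```lisp
(defstruct region width height)

;; Long-form DEFSETF with two store variables, as in the example above.
(defsetf region-size (region) (width height)
  `(values (setf (region-width ,region) ,width)
           (setf (region-height ,region) ,height)))

;; Sketch of a one-place SETF that accepts multiple store variables.
(defmacro setf-1 (place new-value &environment env)
  (multiple-value-bind (temps vals stores store-form)
      (get-setf-expansion place env)
    ;; Bind the temporaries in order, then bind ALL store variables
    ;; from the values of NEW-VALUE instead of using a single LET.
    `(let* ,(mapcar #'list temps vals)
       (multiple-value-bind ,stores ,new-value
         ,store-form))))
```

Because MULTIPLE-VALUE-BIND ignores extra values and defaults missing ones
to NIL, the sketch inherits the behavior the proposal asks for.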
Cost to Users:
This is an upward-compatible change; no user code must change.
Cost of non-adoption:
Yet another non-uniformity in the language, yet another piece of
mechanism without a clear use (GET-SETF-METHOD-MULTIPLE-VALUE).
Benefits:
Wider applicability of a reasonably nice abstraction, the removal of
an artificial prohibition.
Aesthetics:
People may disagree about whether this is a simplification or not. I
am firmly on the side that believes that such removal of
non-uniformities is a simplifying force in the language.
Discussion:
Pavel supports this proposal.
Moon supports this proposal except he is not sure about the
inclusion of ASSERT.
GSB suggests that this is a clarification rather than an addition,
because the lack of any predefined setf-methods that use multiple
store variables should not mean that SETF, etc. should not work with
such a setf-method if the user defined one. The problem is that CLtL
examples such as the ones for SHIFTF on p.98 and the simplified
definition for SETF on p.107 contradict this proposal, and might have
been taken as specifications, rather than simplified examples, by
some readers.
Predefined SETF methods for such functions as VALUES, CONS, and VECTOR
could have been proposed, but we refrained. This proposal is necessary
to allow the user to write such methods for himself, but if this
proposal is adopted those setf-methods are very easy to write in
a portable fashion.
∂22-Jun-89 1251 X3J13-mailer issue SYNTACTIC-ENVIRONMENT-ACCESS, version 10
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 22 Jun 89 12:50:45 PDT
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA28295; Thu, 22 Jun 89 13:50:51 -0600
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA02831; Thu, 22 Jun 89 13:50:47 -0600
Date: Thu, 22 Jun 89 13:50:47 -0600
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8906221950.AA02831@defun.utah.edu>
To: x3j13@sail.stanford.edu
Reply-To: cl-compiler@sail.stanford.edu
Subject: issue SYNTACTIC-ENVIRONMENT-ACCESS, version 10
This is a revised version of an issue that was distributed last week.
Besides fixing a few more typos, I have added some clarifications,
mostly relating to the interaction between DEFDECLARE and normal
compiler/interpreter processing of DECLARE, and resolution of
potential conflicts between declaration specifiers defined with
DEFDECLARE and those defined in the standard or by the implementation.
Forum: Compiler
Issue: SYNTACTIC-ENVIRONMENT-ACCESS
References: CLtL Chapter 8: Macros,
CLtL Chapter 9: Declarations,
Issue COMPILE-FILE-ENVIRONMENT,
Issue DEFINING-MACROS-NON-TOP-LEVEL,
Issue DESTRUCTURING-BIND,
Issue EVAL-WHEN-NON-TOP-LEVEL,
Issue GET-SETF-METHOD-ENVIRONMENT,
Issue FUNCTION-NAME,
Issue FUNCTION-TYPE,
Issue MACRO-ENVIRONMENT-EXTENT,
Issue MACRO-FUNCTION-ENVIRONMENT,
Issue PROCLAIM-LEXICAL,
Issue PACKAGE-CLUTTER
Category: ADDITION
Edit history: Version 1, 2-Oct-88, Eric Benson
Version 2, 17-Feb-89, Kim A. Barrett
Version 3, 9-Mar-89, Kim A. Barrett (respond to comments)
Version 4, 12-Mar-89, Sandra Loosemore (more revisions)
Version 5, 20-Mar-89, Sandra Loosemore (only proposal SMALL)
Version 6, 23-Mar-89, Sandra Loosemore (more revisions)
Version 7, 7-Apr-89, Moon & Barrett (more revisions)
Version 8, 9-Jun-89, Kim A. Barrett (add DEFDECLARE)
Version 9, 13-Jun-89, Moon (small corrections)
Version 10, 22-Jun-89, Loosemore (more clarifications,
primarily relating to DEFDECLARE)
Status: Ready for release
Problem description:
When macro forms are expanded, the expansion function is called with two
arguments: the form to be expanded, and the environment in which the form was
found. The environment argument is of limited utility. The only use
sanctioned currently is as an argument to MACROEXPAND or MACROEXPAND-1 or
passed directly as an argument to another macro expansion function. Recently
passed cleanup issues allow it as an argument to MACRO-FUNCTION and to
GET-SETF-METHOD.
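The sanctioned use can be seen in a small example. In the sketch below,
EXPANSION-OF is a hypothetical macro that does nothing with its
environment argument except hand it to MACROEXPAND-1, which is enough for
a local MACROLET definition to be seen.

```lisp
;; EXPANSION-OF is a hypothetical macro.  It expands FORM once in the
;; environment of the call and returns the expansion as a constant.
(defmacro expansion-of (form &environment env)
  `',(macroexpand-1 form env))

;; The local macro PAIR is visible only through ENV.
(defun expansion-demo ()
  (macrolet ((pair (x) `(cons ,x ,x)))
    (expansion-of (pair 1))))
```

Here (EXPANSION-DEMO) returns the list (CONS 1 1). Nothing in the example
can ask the environment what PAIR is or add a definition to it, which is
the limitation described in this section.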
It is very difficult to write a code walker that can correctly handle local
macro and function definitions, due to insufficient access to the information
contained in environments and the inability to augment environments with local
definitions.
Proposal (SYNTACTIC-ENVIRONMENT-ACCESS:SMALL):
The following functions provide information about syntactic environment
objects. In all of these functions the argument named ENV is an environment
of the sort received by the &ENVIRONMENT argument to a macro or as the
environment argument for EVALHOOK. (It is not required that implementations
provide a distinguished representation for such objects.) Optional "env"
arguments default to NIL, which represents the local null lexical environment
(containing only global definitions and proclamations that are present in the
runtime environment). All of these functions should signal an error of type
TYPE-ERROR if the value of an environment argument is not a syntactic
environment.
The accessors VARIABLE-INFORMATION, FUNCTION-INFORMATION, and
DECLARATION-INFORMATION retrieve information about declarations that are in
effect in the environment. Since implementations are permitted to ignore
declarations (except for SPECIAL declarations and OPTIMIZE SAFETY
declarations if they ever compile unsafe code), these accessors are required
only to return information about declarations that were explicitly added to
the environment using AUGMENT-ENVIRONMENT. They might also return
information about declarations recognized and added to the environment by the
interpreter or the compiler, but that is optional at the discretion of the
implementer. Implementations are also permitted to canonicalize
declarations, so the information returned by the accessors might not be
identical to the information that was passed to AUGMENT-ENVIRONMENT.
VARIABLE-INFORMATION variable &optional env [Function]
This function returns information about the interpretation of the symbol
VARIABLE when it appears as a variable within the lexical environment ENV.
The following three values are returned.
The first value indicates the type of definition or binding which is apparent
in ENV:
    NIL            There is no apparent definition or binding for variable.
    :SPECIAL       VARIABLE refers to a special variable, either declared
                   or proclaimed.
    :LEXICAL       VARIABLE refers to a lexical variable.
    :SYMBOL-MACRO  VARIABLE refers to a SYMBOL-MACROLET binding.
    :CONSTANT      VARIABLE refers to a named constant, defined by
                   DEFCONSTANT, or VARIABLE is a keyword symbol.
[Note: If issue PROCLAIM-LEXICAL passes, then the :LEXICAL result will also
refer to variables proclaimed lexical.]
The second value indicates whether there is a local binding of the name. If
the name is locally bound, the second value is true. Otherwise, NIL is
returned.
The third value is a property list containing information about declarations
that apply to the apparent binding of the variable. The keys in the property
list are symbols which name declaration-specifiers, and the format of the
corresponding values depends on the particular declaration-specifier
involved. The standard declaration-specifiers that might appear as keys in
this property list are:
    DYNAMIC-EXTENT  a non-NIL value indicates that the variable has been
                    declared DYNAMIC-EXTENT. If the value is NIL, the property
                    might be omitted.
    IGNORE          a non-NIL value indicates that the variable has been
                    declared IGNORE. If the value is NIL, the property might
                    be omitted.
    TYPE            a type specifier associated with the variable by a TYPE
                    declaration or an abbreviated declaration such as
                    (FIXNUM VAR). If no explicit association exists, either
                    by PROCLAIM or DECLARE, then the type specifier is T. It
                    is permissible for implementations to use a type
                    specifier that is equivalent to or a supertype of the one
                    appearing in the original declaration. If the value is T,
                    the property might be omitted.
If an implementation supports additional declaration-specifiers that
apply to variable bindings, those declaration-specifiers might also
appear in the property list. However, the corresponding key must not
be a symbol that is external in any package defined in the standard
or that is otherwise accessible in the USER package.
The property list might contain multiple entries for a given
property. The consequences of destructively modifying the list
structure of this property list or its elements (except for values that
appear in the property list as a result of DEFDECLARE) are undefined.
Programmers are reminded that the global binding might differ from the
local one, and can be retrieved by calling VARIABLE-INFORMATION
again with a null lexical environment.
FUNCTION-INFORMATION function &optional env [Function]
This function returns information about the interpretation of the function
name FUNCTION when it appears in a functional position within lexical
environment ENV. The following three values are returned.
The first value indicates the type of definition or binding of the function
name which is apparent in ENV:
    NIL            There is no apparent definition for FUNCTION.
    :FUNCTION      FUNCTION refers to a function.
    :MACRO         FUNCTION refers to a macro.
    :SPECIAL-FORM  FUNCTION refers to a special form.
Some function names can refer to both a global macro and a global special
form. In such a case, the macro takes precedence, and :MACRO is returned as
the first value.
The second value specifies whether the definition is local or global. If
local, the second value is true, and it is false when the definition is
global.
The third value is a property list containing information about declarations
that apply to the apparent binding of the function. The keys in the property
list are symbols which name declaration-specifiers, and the format of the
corresponding values depends on the particular declaration-specifier
involved. The standard declaration-specifiers that might appear as keys in
this property list are:
    INLINE  one of the symbols INLINE, NOTINLINE, or NIL to indicate
            whether the function name has been declared INLINE, has been
            declared NOTINLINE, or neither. If the value is NIL, the
            property might be omitted.
    FTYPE   the type specifier associated with the function name in the
            environment, or the symbol FUNCTION if there is no functional
            type declaration or proclamation associated with the function
            name. This value might not include all the apparent FTYPE
            declarations for the function name. It is permissible for
            implementations to use a type specifier that is equivalent
            to or a supertype of the one that appeared in the original
            declaration. If the value is FUNCTION, the property might be
            omitted.
If an implementation supports additional declaration-specifiers that
apply to function bindings, those declaration-specifiers might also
appear in the property list. However, the corresponding key must not be
a symbol that is external in any package defined in the standard or
that is otherwise accessible in the USER package.
The property list might contain multiple entries for a given
property. In this case the value associated with the first entry has
precedence. The consequences of destructively modifying the list
structure of this property list or its elements (except for values
that appear in the property list as a result of DEFDECLARE) are
undefined.
[If issue DYNAMIC-EXTENT-FUNCTION passes, the property DYNAMIC-EXTENT will
be added to the above table.]
Programmers are reminded that the global binding might differ from the local
one, and can be retrieved by calling FUNCTION-INFORMATION again with a null
lexical environment.
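The property list returned as the third value can be read with GETF, which returns the value of the first matching entry and so agrees with the rule that the first entry has precedence. The following is a minimal sketch that uses hand-built lists in place of real FUNCTION-INFORMATION results; the helper names DECLARED-INLINE and DECLARED-FTYPE are hypothetical and are not part of the proposal.

```lisp
;; DECLS stands in for the third value of FUNCTION-INFORMATION.
;; An omitted INLINE property means the same as NIL, and an omitted
;; FTYPE property means the same as FUNCTION, so those are the defaults.
(defun declared-inline (decls)
  (getf decls 'inline nil))

(defun declared-ftype (decls)
  (getf decls 'ftype 'function))

;; With duplicate entries, GETF sees only the first one.
(declared-inline '(inline notinline inline inline))   ; => NOTINLINE
(declared-ftype '())                                   ; => FUNCTION
```

Supplying the defaults in the accessor means a caller does not need to care whether an implementation chose to omit the property.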
DECLARATION-INFORMATION decl-name &optional env [Function]
This function returns information about declarations named by the
symbol DECL-NAME that are in force in the environment ENV.
Only declarations that do not apply to function or variable bindings
(i.e., those that are "pervasive") can be accessed with this function.
The format of the information that is returned depends on the DECL-NAME
involved.
It is required that this function recognize OPTIMIZE and DECLARATION as
DECL-NAMEs. The values returned for these two cases are as follows:
    OPTIMIZE     a list whose entries are of the form (quality value), where
                 "quality" is one of the optimization qualities defined by the
                 standard (SPEED, SAFETY, COMPILATION-SPEED, SPACE, and DEBUG)
                 or some implementation-specific optimization quality, and
                 "value" is an integer in the range 0 to 3. The returned list
                 always contains an entry for each of the standard qualities
                 and for each of the implementation-specific qualities. In the
                 absence of any previous declarations, the associated values
                 are implementation-dependent. The list might contain multiple
                 entries for a quality, in which case the first such entry
                 specifies the current value.
                 The consequences of destructively modifying this list or
                 its elements are undefined.
    DECLARATION  a list of the declaration names which have been proclaimed as
                 valid through the use of the DECLARATION proclamation.
                 The consequences of destructively modifying this list or
                 its elements are undefined.
If an implementation has been extended to recognize additional
pervasive declaration specifiers in DECLARE or PROCLAIM, it is required that
either the DECLARATION-INFORMATION function should also recognize those
declarations, or that the implementation provide an accessor that is
specialized for that declaration specifier. If DECLARATION-INFORMATION
is used to return the information, the corresponding DECL-NAME must not
be a symbol that is external in any package defined in the standard or
that is otherwise accessible in the USER package.
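Because the OPTIMIZE result is a list of (quality value) entries in which the first entry for a quality wins, ASSOC is enough to read it. A minimal sketch, using a hand-built list in place of a real DECLARATION-INFORMATION result; the helper name OPTIMIZE-QUALITY is hypothetical.

```lisp
;; OPTIMIZE-INFO stands in for the value of
;; (DECLARATION-INFORMATION 'OPTIMIZE ENV).
;; ASSOC returns the first matching entry, which is the current value.
(defun optimize-quality (quality optimize-info)
  (second (assoc quality optimize-info)))

(optimize-quality 'safety '((safety 3) (speed 1) (safety 0)))   ; => 3
```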
AUGMENT-ENVIRONMENT env &KEY variable
symbol-macro
function
macro
declare [Function]
This function returns a new environment containing the information present in
ENV, augmented with the information provided by the keyword arguments. It is
intended to be used by program analyzers that perform a code walk.
The arguments are supplied as follows:
    :VARIABLE      A list of symbols which shall be visible as bound variables
                   in the new environment. Whether each binding is to be
                   interpreted as special or lexical depends on SPECIAL
                   declarations recorded in the environment or provided in
                   the :DECLARE argument list.
    :SYMBOL-MACRO  A list of symbol macro definitions, specified as a list of
                   (name definition) lists (that is, in the same format as the
                   CADR of a SYMBOL-MACROLET special form). The new environment
                   will have local symbol-macro bindings of each symbol to the
                   corresponding expansion, so that MACROEXPAND will be able to
                   expand them properly. A type declaration in the :DECLARE
                   argument which refers to a name in this list implicitly
                   modifies the definition associated with the name. The effect
                   is to wrap a THE form mentioning the type around the
                   definition.
    :FUNCTION      A list of function names which shall be visible as local
                   function bindings in the new environment.
    :MACRO         A list of local macro definitions, specified as a list of
                   (name definition) lists. Each definition must be a function
                   of two arguments (a form and an environment). The new
                   environment will have local macro bindings of each name to
                   the corresponding expander function, which will be returned
                   by MACRO-FUNCTION and used by MACROEXPAND.
    :DECLARE       A list of decl-specs. Information about these declarations
                   can be retrieved from the resulting environment using the
                   VARIABLE-INFORMATION, FUNCTION-INFORMATION, and
                   DECLARATION-INFORMATION accessors.
An error is signalled if any of the symbols naming macros in the
:SYMBOL-MACRO alist are also included in the :VARIABLE list. An error is
signalled if any of the symbols naming macros in the :SYMBOL-MACRO alist are
also included in a SPECIAL decl-spec in the :DECLARE argument. An error is
signalled if any of the names of macros in the :MACRO alist are also included
in the :FUNCTION list. The consequences of destructively modifying the list
structure of any of the arguments to this function are undefined.
The condition type of each of these errors is PROGRAM-ERROR.
The extent of the returned environment is the same as the extent of the
argument environment. The result might share structure with the argument
environment, but the argument environment is not modified.
While an environment argument from EVALHOOK is permitted to be used as the
environment argument for this function, the reverse is not true. If an
attempt is made to use the result of AUGMENT-ENVIRONMENT as the environment
argument for EVALHOOK, the consequences are undefined. The environment
returned by AUGMENT-ENVIRONMENT can only be used for syntactic analysis, i.e.,
the functions specified by this proposal and functions such as MACROEXPAND.
DEFDECLARE decl-name lambda-list &body body [Macro]
Define a handler for the named declaration. This is the mechanism by which
AUGMENT-ENVIRONMENT is extended to support additional declaration
specifiers. The function defined by this macro will be called with two
arguments, a decl-spec whose CAR is decl-name, and the ENV argument to
AUGMENT-ENVIRONMENT. Two values must be returned by the function. The
first value must be one of the following keywords:
    :VARIABLE  the declaration applies to variable bindings.
    :FUNCTION  the declaration applies to function bindings.
    :DECLARE   the declaration is pervasive, rather than applying to bindings.
For the case where the first value indicates the declaration applies to
bindings, the second value is a list, the elements of which are lists of the
form (binding-name property value). If the corresponding information
function (either VARIABLE-INFORMATION or FUNCTION-INFORMATION) is applied to
the binding name and the augmented environment, the property list which is
the third value returned by the information function will contain the value
under the specified property.
When the first value is :DECLARE, the second value is a cons whose CAR is a
property and whose CDR is the associated value. The function
DECLARATION-INFORMATION, when applied to the property and the augmented
environment, will return the associated value.
DEFDECLARE causes DECL-NAME to be proclaimed to be a declaration (it is as
if its expansion included a call (PROCLAIM '(DECLARATION decl-name))).
As is the case with standard declaration specifiers, the evaluator and
compiler are permitted, but not required, to add information about
declaration specifiers defined with DEFDECLARE to the macroexpansion and
evalhook environments.
The consequences are undefined if DECL-NAME is a symbol which can
appear as the CAR of any declaration specifier defined in the standard.
The consequences are also undefined if the return value from a
declaration handler defined with DEFDECLARE includes a property name
that is used by the corresponding accessor to return information about
any declaration specifier defined in the standard. (For example, if
the first return value from the handler is :VARIABLE, the second return
value may not use the symbols DYNAMIC-EXTENT, IGNORE, or TYPE as property
names.)
The DEFDECLARE macro does not have any special compile-time
side-effects.
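The two-value protocol for declaration handlers can be sketched with ordinary functions. The declarations AUTHOR and UNITS and both function names are hypothetical; under the proposal each body would instead be written as (DEFDECLARE decl-name (DECL-SPEC ENV) ...), which is not shown here because DEFDECLARE is not available outside an implementation of this proposal.

```lisp
;; Pervasive declaration, e.g. (AUTHOR "Pat").
;; First value :DECLARE, second value a cons of (property . value).
(defun author-declaration-handler (decl-spec env)
  (declare (ignore env))
  (values :declare (cons 'author (second decl-spec))))

;; Binding declaration, e.g. (UNITS METERS X Y).
;; First value :VARIABLE, second value a list of
;; (binding-name property value) entries, one per variable.
(defun units-declaration-handler (decl-spec env)
  (declare (ignore env))
  (destructuring-bind (decl-name unit &rest variables) decl-spec
    (values :variable
            (mapcar (lambda (variable) (list variable decl-name unit))
                    variables))))

(units-declaration-handler '(units meters x y) nil)
;; => :VARIABLE, ((X UNITS METERS) (Y UNITS METERS))
```

Neither handler uses DYNAMIC-EXTENT, IGNORE, or TYPE as a property name, in keeping with the restriction above.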
PARSE-MACRO name lambda-list body &optional env [Function]
This function is used to process a macro definition in the same way
as DEFMACRO and MACROLET. It returns a lambda-expression that accepts
two arguments (a form and an environment). The "name", "lambda-list",
and "body" arguments correspond to the parts of a DEFMACRO or MACROLET
definition.
The "lambda-list" argument can include &ENVIRONMENT and &WHOLE. The "name"
argument is used to enclose the "body" in an implicit BLOCK, and might also
be used for implementation-dependent purposes (such as including the name of
the macro in error messages if the form does not match the lambda-list).
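To illustrate the shape of the result, here is a deliberately simplified sketch built on DESTRUCTURING-BIND. SIMPLE-PARSE-MACRO is a hypothetical name, not the proposed function: it takes no environment argument, does not support &ENVIRONMENT in the lambda-list, and does not separate a documentation string from the body.

```lisp
;; Build a lambda-expression of two arguments (a form and an environment)
;; that destructures the form's arguments and runs BODY inside a BLOCK
;; named NAME, as DEFMACRO and MACROLET do.
(defun simple-parse-macro (name lambda-list body)
  (let ((form (gensym "FORM"))
        (env (gensym "ENV")))
    `(lambda (,form ,env)
       (declare (ignore ,env))
       (block ,name
         (destructuring-bind ,lambda-list (cdr ,form)
           ,@body)))))

;; With a null lexical environment, COERCE can stand in for ENCLOSE.
(defparameter *double-expander*
  (coerce (simple-parse-macro 'double '(x) '((list '* 2 x))) 'function))

(funcall *double-expander* '(double 5) nil)   ; => (* 2 5)
```

The parts omitted here (environment handling, documentation strings, error reporting for malformed forms) are what make the full definition less than trivial.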
ENCLOSE lambda-expression &optional env [Function]
This function returns an object of type FUNCTION that is equivalent to what
would be obtained by evaluating `(FUNCTION ,LAMBDA-EXPRESSION) in syntactic
environment ENV. The consequences are undefined if any of the local
variable or function bindings (but not macro definitions) that are visible
in the lexical environment represented by ENV are referenced within the
LAMBDA-EXPRESSION.
Rationale:
This proposal defines a minimal set of accessors (VARIABLE-INFORMATION,
FUNCTION-INFORMATION, and DECLARATION-INFORMATION) and a constructor
(AUGMENT-ENVIRONMENT) for environments.
All of the standard declaration specifiers, with the exception of SPECIAL,
can be defined fairly easily using DEFDECLARE. It also seems to be able
to handle most extended declarations.
The PARSE-MACRO function is provided so that users don't have to write their
own code to destructure macro arguments. With the addition of
DESTRUCTURING-BIND to the language, this function is not entirely necessary.
However, it is probably worth including anyway, since any program-analyzing
program is going to need to define it, and the definition isn't completely
trivial.
ENCLOSE is needed to allow expander functions to be defined in a non-NULL
lexical environment, as required by DEFINING-MACROS-NON-TOP-LEVEL:ALLOW. It
also provides a mechanism by which the forms in an (EVAL-WHEN (COMPILE) ...)
can be executed in the enclosing environment (see Issue
EVAL-WHEN-NON-TOP-LEVEL).
Making declarations from an &ENVIRONMENT or EVALHOOK environment optional
continues to allow implementations the freedom simply to ignore all such
declarations in the compiler or interpreter.
Examples:
#1: This example illustrates the first two values returned by the function
VARIABLE-INFORMATION.
    (DEFMACRO KIND-OF-VARIABLE (VAR &ENVIRONMENT ENV)
      (MULTIPLE-VALUE-BIND (KIND BINDINGP)
          (VARIABLE-INFORMATION VAR ENV)
        `(LIST ',VAR ',KIND ',BINDINGP)))
    (DEFVAR A)
    (DEFCONSTANT B 43)
    (DEFUN TEST ()
      (LET (C)
        (LET (D)
          (DECLARE (SPECIAL D))
          (SYMBOL-MACROLET ((E ANYTHING))
            (LIST (KIND-OF-VARIABLE A)
                  (KIND-OF-VARIABLE B)
                  (KIND-OF-VARIABLE C)
                  (KIND-OF-VARIABLE D)
                  (KIND-OF-VARIABLE E)
                  (KIND-OF-VARIABLE F))))))
    (TEST) -> ((A :SPECIAL NIL)
               (B :CONSTANT NIL)
               (C :LEXICAL T)
               (D :SPECIAL T)
               (E :SYMBOL-MACRO T)
               (F NIL NIL))
#2: This example illustrates the first two values returned by the function
FUNCTION-INFORMATION.
    (DEFMACRO KIND-OF-FUNCTION (FUNCTION-NAME &ENVIRONMENT ENV)
      (MULTIPLE-VALUE-BIND (KIND BINDINGP)
          (FUNCTION-INFORMATION FUNCTION-NAME ENV)
        `(LIST ',FUNCTION-NAME ',KIND ',BINDINGP)))
    (DEFUN A ())
    (DEFUN (SETF A) (V))
    (DEFMACRO B ())
    (DEFUN TEST ()
      (FLET ((C ()))
        (MACROLET ((D ()))
          (LIST (KIND-OF-FUNCTION A)
                (KIND-OF-FUNCTION B)
                (KIND-OF-FUNCTION QUOTE)
                (KIND-OF-FUNCTION (SETF A))
                (KIND-OF-FUNCTION C)
                (KIND-OF-FUNCTION D)
                (KIND-OF-FUNCTION E)))))
    (TEST) -> ((A :FUNCTION NIL)
               (B :MACRO NIL)
               (QUOTE :SPECIAL-FORM NIL)
               ((SETF A) :FUNCTION NIL)
               (C :FUNCTION T)
               (D :MACRO T)
               (E NIL NIL))
#3: This example shows how a code-walker might walk a MACROLET special form.
It assumes that the revised MACROLET semantics described in proposal
DEFINING-MACROS-NON-TOP-LEVEL:ALLOW are in effect.
    (DEFUN WALK-MACROLET (FORM ENV)
      (LET ((MACROS (MAKE-MACRO-DEFINITIONS (CADR FORM) ENV)))
        (MULTIPLE-VALUE-BIND (BODY DECLS) (PARSE-BODY (CDDR FORM))
          (WALK-IMPLICIT-PROGN
            BODY
            (AUGMENT-ENVIRONMENT ENV :MACRO MACROS :DECLARE DECLS)))))
    (DEFUN MAKE-MACRO-DEFINITIONS (DEFS ENV)
      (MAPCAR #'(LAMBDA (DEF)
                  (LET ((NAME (CAR DEF)))
                    (LIST NAME
                          (ENCLOSE (PARSE-MACRO NAME (CADR DEF) (CDDR DEF) ENV)
                                   ENV))))
              DEFS))
Cost to Implementors:
Most implementations already record some of this information in some form.
Providing these functions should not be too difficult, but it is a more than
trivial amount of work.
Cost to Users:
This change is upward compatible with user code.
Current practice:
No implementation provides all of this interface currently. Portable Common
Loops defines a subset of this functionality for its code walker and
implements it on a number of different versions of Common Lisp.
Discussion:
The first version of this proposal expressly did not deal with the objects
which are used as environments by EVALHOOK. This version is extended to
support them in the belief that such environments share a lot of functionality
with the syntactic environments needed by a compiler. While the two types of
environments might have very different implementations, there are many
operations which are reasonable to perform on either type, including all of
the accessor functions described by this proposal.
There has been discussion about a macro called WITH-AUGMENTED-ENVIRONMENT,
either in addition to or instead of AUGMENT-ENVIRONMENT. The point of this
would be to say that the extent of the augmented environment is the dynamic
extent of the WITH-AUGMENTED-ENVIRONMENT form. There was some concern that
there might be cases where the macro was awkward to use. Such a macro is not
included in this proposal. If AUGMENT-ENVIRONMENT is provided, then such a
macro is trivially written in terms of the function. There are places in the
processing of sequential binding forms where using such a macro might be more
difficult than using the specified function.
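As a sketch of how small such a macro would be, the following definition expands into a call to the proposed AUGMENT-ENVIRONMENT function. The macro's argument syntax is an assumption made here for illustration, and nothing in this expansion enforces dynamic extent; it only limits the scope of the variable.

```lisp
;; (WITH-AUGMENTED-ENVIRONMENT (var env key value ...) body...)
;; binds VAR to the augmented environment around BODY.
;; AUGMENT-ENVIRONMENT is the function proposed above, so the expansion
;; can only be run where that function exists.
(defmacro with-augmented-environment ((var env &rest keys) &body body)
  `(let ((,var (augment-environment ,env ,@keys)))
     ,@body))

(macroexpand-1
 '(with-augmented-environment (new-env env :variable '(x))
    (walk body new-env)))
;; => (LET ((NEW-ENV (AUGMENT-ENVIRONMENT ENV :VARIABLE '(X))))
;;      (WALK BODY NEW-ENV))
```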
Some people have indicated they think that the :MACRO argument (and the
:SYMBOL-MACRO argument too?) to AUGMENT-ENVIRONMENT should be an a-list of the
form ((name . definition)...) rather than the form ((name definition)...).
Some people have indicated they think that implementations must never discard
any declarations, even if they are not otherwise used by the interpreter or
compiler. Proposal SMALL is consistent with what CLtL says (implementations
are free to ignore all declarations except SPECIAL declarations), but the
DECLARATION-INFORMATION function might not be particularly useful unless it is
guaranteed to do something. Requiring implementations to keep track of
declarations they'd otherwise ignore would involve some implementation cost
and also might incur a performance penalty.
ENCLOSE happens to subsume the extension to COERCE for converting a lambda
expression into a function (see Issue FUNCTION-TYPE, passed in June 1988).
Perhaps the extension to COERCE should be backed out?
There have been some suggestions for related functionality that have not
been included in this proposal because we haven't had the time to give
them adequate consideration, and some of them might be controversial.
These suggestions include:
- Adding a function to canonicalize type specifiers.
- Extending VARIABLE-INFORMATION to return a value indicating whether there
is a special binding of the variable in the environment, regardless of
whether or not it has been shadowed by a lexical or symbol-macro binding
of the same name.
- A function to map over all names that are defined in the lexical
environment:
    MAP-ENVIRONMENT fn key &optional env
    KEY must be one of the symbols :FUNCTION, :VARIABLE, or :DECLARATION.
    When KEY is :FUNCTION,
      for every symbol S for which (FUNCTION-INFORMATION S ENV)
      would return the values X, true, Y, for any X and Y,
      FN is applied to the arguments S, X, and Y.
    When KEY is :VARIABLE,
      for every symbol S for which (VARIABLE-INFORMATION S ENV)
      would return the values X, true, Y, for any X and Y,
      FN is applied to the arguments S, X, and Y.
    When KEY is :DECLARATION,
      for every symbol S for which (DECLARATION-INFORMATION S ENV)
      would return a non-NIL value L,
      FN is applied to the arguments S and L.
- Adding additional accessors and keyword arguments to AUGMENT-ENVIRONMENT
for BLOCK and TAGBODY labels.
∂11-Jan-89 2316 X3J13-mailer Issue: THE-AMBIGUITY (Version 2)
Received: from Xerox.COM by SAIL.Stanford.EDU with TCP; 11 Jan 89 23:16:17 PST
Received: from Cabernet.ms by ArpaGateway.ms ; 11 JAN 89 23:14:58 PST
Date: 11 Jan 89 23:14 PST
Sender: masinter.pa@Xerox.COM
Subject: Issue: THE-AMBIGUITY (Version 2)
To: X3J13@Sail.Stanford.Edu
Reply-to: cl-cleanup@sail.stanford.edu
From: cl-cleanup@sail.stanford.edu
cc: masinter.pa@Xerox.COM
line-fold: No
Message-ID: <890111-231458-11647@Xerox>
This issue has two proposals.
!
Forum: cleanup
Issue: THE-AMBIGUITY
References: THE (page 161)
Category: CLARIFICATION
Edit history: 21-Oct-88, version 1 by Rees
11-Jan-89, version 2 by Masinter (fix typos)
Problem description:
CLtL does not explicitly say whether the type specifier in a THE
form may be any type specifier or must be a type specifier suitable
for discrimination. Although THE is described as a "declaration"
form, some CL implementations have assumed that the specifier must
be for discrimination, and disallow e.g.,
(THE (FUNCTION (T T) CONS) #'CONS)
We should either say that the implementations are right, or
explicitly say that they are wrong, since this case is easily
overlooked.
Proposal (THE-AMBIGUITY:FOR-DECLARATION):
Clarify that the type specifier in
(THE type exp)
may be any valid type specifier. In the case that exp returns one
value and type is not a VALUES type specifier, (THE type exp) is
equivalent to
(LET ((g exp))
(DECLARE (TYPE type g))
g)
where "g" is a gensym.
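The stated equivalence can be written out as a macro. This is only an illustration of the single-value case; THE-AS-LET is a hypothetical name, and whether a violated declaration is detected remains up to the implementation.

```lisp
;; Expand (THE-AS-LET type exp) into the LET/DECLARE form given above,
;; using a fresh uninterned symbol for the variable.
(defmacro the-as-let (type exp)
  (let ((g (gensym)))
    `(let ((,g ,exp))
       (declare (type ,type ,g))
       ,g)))

(the-as-let fixnum (+ 1 2))   ; => 3
```

Under FOR-DECLARATION, any type specifier that is legal in a TYPE declaration would be legal in the TYPE position here, including the list form of FUNCTION.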
Proposal (THE-AMBIGUITY:FOR-DISCRIMINATION):
Clarify that the type specifier in
(THE type exp)
must be a valid type for discrimination, as for TYPEP, or it must
be of the form (VALUES type*) where type* are all valid for discrimination.
Current practice:
The Symbolics Genera and VAX LISP V2.2 interpreters signal errors for
(THE (FUNCTION (T T) CONS) #'CONS),
but this may not be intentional. CLtL would seem to allow it.
Test case:
(THE (FUNCTION (T T) CONS) #'CONS),
should return the CONS function under FOR-DECLARATION,
and should be an error under FOR-DISCRIMINATION.
Cost to implementors:
Trivial cost for THE-AMBIGUITY:FOR-DISCRIMINATION; this is a compatible
restriction.
For THE-AMBIGUITY:FOR-DECLARATION, implementations that do not
already allow arbitrary type specifiers but which want to check that
the type in a THE is satisfied would have to create an internal
version of TYPEP which could manage not to signal invalid-type-specifier
errors in those situations where TYPEP would because the type is a
declaration-only one.
Cost to users:
Users of implementations that support THE-AMBIGUITY:FOR-DECLARATION
might have to remove or change some uses of THE in their code if the
opposing alternative is adopted.
Benefits:
Either way, an ambiguity in the language specification would be clarified.
Aesthetics:
THE-AMBIGUITY:FOR-DECLARATION would seem to be more consistent with
DECLARE and with the intent of THE, which is supposed to be a way to
provide information for documentation and for the benefit of compilation.
Discussion:
Rees supports THE-AMBIGUITY:FOR-DECLARATION.
Appropriate error situation terminology must be chosen for the
situation that a THE declaration (or other declaration) is
unsatisfied, but that must be done regardless of this proposal.
This proposal would suggest that a function should be added to CL to
do the checking that THE would want to do:
    (PROBABLY-TYPEP object type-spec)
[terrible name of course] returns multiple values a la SUBTYPEP: T T
if the object definitely has the type, NIL T if it definitely
doesn't, and T NIL (or NIL NIL?) otherwise. Assuming that an
interpreted THE-expression actually checks types, you could almost
define this function using the condition system and EVAL. (Ugh!)
Without PROBABLY-TYPEP, a meta-circular interpreter is more
difficult to write.
If a suitable name was found for this function, the additional
functionality could be suggested as an independent proposal, since
regardless of the outcome of this issue, the functionality is still
useful for checking DECLARE's.
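A rough approximation can be sketched with TYPEP and a condition handler, rather than with EVAL. It assumes that TYPEP signals an error when given a declaration-only specifier such as the list form of FUNCTION, and it answers NIL NIL for anything TYPEP rejects, so it is weaker than what a checking THE would need.

```lisp
;; Returns two values in the style of SUBTYPEP:
;;   T   T    the object definitely has the type
;;   NIL T    the object definitely does not have the type
;;   NIL NIL  TYPEP could not decide (it signalled an error)
(defun probably-typep (object type-spec)
  (handler-case (values (and (typep object type-spec) t) t)
    (error () (values nil nil))))

(probably-typep 3 'integer)   ; => T, T
(probably-typep 3 'string)    ; => NIL, T
```

Treating every error as "unknown" is a design shortcut: it also hides genuinely malformed type specifiers, which a real implementation would want to report.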
Various implementation mechanisms were discussed for dealing
with THE checking.
Are there any remaining type specifiers beyond the list form
of the FUNCTION type that differ between "declaration" and
"discrimination"?
"I support FOR-DECLARATION. Lucid CL has the same bug in
the interpreter as the others (a "bug" assuming FOR-DECLARATION).
TYPEP is used to check the legality of the type specifier in THE."
In considering possible ways in which the type-checking logic
for THE and DECLARE might work, don't forget things like
(the (not (function (t t) integer)) 7),
which you would want to signal an error. I don't think this can be
done with only TYPEP and conditions.
∂11-Jan-89 2346 X3J13-mailer Issue: UNDEFINED-VARIABLES-AND-FUNCTIONS (Version 1)
Received: from Xerox.COM by SAIL.Stanford.EDU with TCP; 11 Jan 89 23:46:43 PST
Received: from Semillon.ms by ArpaGateway.ms ; 11 JAN 89 23:45:32 PST
Date: 11 Jan 89 23:45 PST
Sender: masinter.pa@Xerox.COM
Subject: Issue: UNDEFINED-VARIABLES-AND-FUNCTIONS (Version 1)
To: X3J13@Sail.Stanford.Edu
Reply-to: cl-cleanup@sail.stanford.edu
From: cl-cleanup@sail.stanford.edu
cc: masinter.pa@Xerox.COM
line-fold: No
Message-ID: <890111-234532-11838@Xerox>
It was believed that this issue might be controversial.
!
Issue: UNDEFINED-VARIABLES-AND-FUNCTIONS
References: 5.1.2 Variables (CLtL pp55-56),
Slots (88-002R, p1-10)
Category: CHANGE
Edit history: 29-Nov-88, Version 1 by Pitman
Problem Description:
CLtL does not specify what happens if you attempt to call a named function
which is in fact undefined. In most implementations, it would be devastating to
actually jump to code which you had not verified to be a function, so this error
should be easily caught -- yet, CLtL does not guarantee that an error will be
signalled even in the safest, least fast OPTIMIZE settings.
CLtL (p56) specifies that "it is an error to refer to a variable that is unbound."
CLOS (p1-10) specifies that "when an unbound slot is read, the generic function
SLOT-UNBOUND is invoked. The system-supplied primary method for SLOT-UNBOUND
signals an error."
CLOS and CLtL are not in agreement on their treatment of unbound variables.
CLtL is very weak in that it guarantees no support for reliably detecting
and signalling an error when the error situation occurs, even in the safest,
least fast OPTIMIZE setting.
CLOS is very strong in that it forces detection of the error in all
situations -- even in the fastest, least safe OPTIMIZE setting.
The disparate positions for treatment of variables and slots should be
reconciled, either by finding a compromise or by justifying the apparent
inconsistency. The final story should explain function references as well.
Proposal (UNDEFINED-VARIABLES-AND-FUNCTIONS:COMPROMISE):
Define that reading an undefined function, an unbound variable, or
an unbound slot must be detected in the highest safety setting,
but the effect is undefined in any other safety setting. That is,
- Reading an undefined function should signal an error.
- Reading an unbound variable should signal an error.
- Reading an unbound slot should invoke the function SLOT-UNBOUND.
By ``reading an undefined function'' in the above, we mean to
include both references to the function using the FUNCTION
special form, such as F in (FUNCTION F) and references to the
function in a call, such as F in (F X).
For the case of INLINE functions (in implementations where they are
supported), it is permissible to consider that performing the inlining
constitutes the read, so that an FBOUNDP check need not be done at
execution time. Put another way, the effect of FMAKUNBOUND of a function
on potentially inlined references to that function is undefined.
Specify that the type of error signalled when an undefined function
is detected is UNDEFINED-FUNCTION, and that the NAME slot of the
UNDEFINED-FUNCTION condition is initialized to the name of the
offending function.
Specify that the type of error signalled when an unbound variable
is detected is UNBOUND-VARIABLE, and that the NAME slot of the
UNBOUND-VARIABLE condition is initialized to the name of the
offending variable.
Introduce a new condition type, UNBOUND-SLOT, which inherits from
CELL-ERROR. This new type has an additional slot, INSTANCE, which
can be initialized using the :INSTANCE keyword to MAKE-CONDITION.
Introduce a new function UNBOUND-SLOT-INSTANCE to access the INSTANCE slot.
Specify that the type of error signalled by the default primary
method for the SLOT-UNBOUND generic function is UNBOUND-SLOT,
and that the NAME slot of the UNBOUND-SLOT condition is initialized
to the name of the offending slot, and that the INSTANCE slot
of the UNBOUND-SLOT condition is initialized to the offending instance.
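The three condition types can be told apart in a handler as sketched below. The class POINT, the function CLASSIFY-READ, and the DEMO- names are hypothetical. The reads go through FUNCALL and SYMBOL-VALUE on symbols only so that the example compiles without warnings about undefined names, and the reader CELL-ERROR-NAME for the NAME slot is assumed to be available.

```lisp
(defclass point ()
  ((x :initarg :x)))          ; no initform, so X can be left unbound

;; Call THUNK and report which kind of cell error, if any, it signals.
(defun classify-read (thunk)
  (handler-case (progn (funcall thunk) (list :ok))
    (undefined-function (c)
      (list :undefined-function (cell-error-name c)))
    (unbound-variable (c)
      (list :unbound-variable (cell-error-name c)))
    (unbound-slot (c)
      (list :unbound-slot (cell-error-name c) (unbound-slot-instance c)))))

(classify-read (lambda () (funcall 'demo-undefined-function)))
;; => (:UNDEFINED-FUNCTION DEMO-UNDEFINED-FUNCTION)
(classify-read (lambda () (symbol-value 'demo-unbound-variable)))
;; => (:UNBOUND-VARIABLE DEMO-UNBOUND-VARIABLE)
```

Under this proposal the errors are only guaranteed in the highest safety setting, so code that relies on such handlers should say so with an OPTIMIZE declaration.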
Test Case:
(PROCLAIM '(OPTIMIZE (SAFETY 3) (SPEED 0)))
(DEFUN FOO () X)
(FOO)
>>Error: The variable X is not bound.
...
Rationale:
This makes it easier to treat slots like variables.
This makes it possible to better rely on an unbound variable error being
signalled when one has occurred.
This makes it possible to compile out useless error checking in CLOS
code where the code is debugged and the checking is redundant.
For the case of undefined functions, blindly jumping to an undefined
function is an incredibly dangerous thing to do. Every implementation
should guarantee at least some way to get error checking of undefined
functions.
Current Practice:
Until recently, Symbolics Cloe did not ever signal an error on unbound
variable, even in the safest case. The excuse was that CLtL
didn't require it, but it was sometimes an impediment to debugging.
Some benchmarks for Symbolics Cloe (which currently only claims to
implement New Flavors, not CLOS) could be faster if checking for unbound
instance variables could be optimized away.
Symbolics Genera doesn't care about safety issues in variable access
because the check can be done by microcode.
Cost to Implementors:
This change does not force a change to any current implementation, except
those which do not yet signal unbound variable or undefined function errors
even in the safest setting.
Cost to Users:
This checking might slow down some code which is written for the safest
setting yet does not need this check.
Any implementation-specific code which depended on these references not
signalling would be broken. Such code was not portable, of course.
Any CLOS code which depends on SLOT-UNBOUND being called even in low safety
settings would be broken. The amount of such code at this point is likely
to be little or none. If such cases did exist, local or global changes to
safety settings would correct the problem (at some cost in speed).
Cost of Non-Adoption:
Some important error checking would not occur.
Some important optimizations could not be done.
The language would seem internally less consistent.
Benefits:
The costs of non-adoption would be avoided.
Aesthetics:
This would regularize things a little.
Discussion:
Pitman thinks this would be a good idea.
∂17-Mar-89 2126 CL-Cleanup-mailer New issue: WITH-OPEN-FILE-DOES-NOT-EXIST
Received: from Sun.COM by SAIL.Stanford.EDU with TCP; 17 Mar 89 21:25:40 PST
Received: from snail.Sun.COM (snail.Corp.Sun.COM) by Sun.COM (4.1/SMI-4.0)
id AA23966; Fri, 17 Mar 89 21:26:15 PST
Received: from denali.sun.com by snail.Sun.COM (4.1/SMI-4.1)
id AA08587; Fri, 17 Mar 89 21:22:28 PST
Received: from localhost by denali.sun.com (3.2/SMI-3.2)
id AA21223; Fri, 17 Mar 89 21:25:44 PST
Message-Id: <8903180525.AA21223@denali.sun.com>
To: cl-cleanup@sail.stanford.edu
Subject: New issue: WITH-OPEN-FILE-DOES-NOT-EXIST
Date: Fri, 17 Mar 89 21:25:42 PST
From: peck@Sun.COM
Really a request for an editorial change, so users will know what
to expect. A user actually reported this as a bug...
Unless someone believes that STREAM-IS-NIL is wrong, this could just
be forwarded to editorial.
Issue: WITH-OPEN-FILE-DOES-NOT-EXIST
References: CLtL page 422
Category: Clarify
Edit history: 17-Mar-89, Version 1
Problem description:
The documentation for WITH-OPEN-FILE (p 422) says:
"WITH-OPEN-FILE evaluates the Forms of the body (an implicit PROGN)
with the variable Stream bound to a stream that reads or writes the
file named by the value of Filename. The options are evaluated and
used as keyword arguments to the function OPEN."
It is not clear what to do when there is no stream
"that reads or writes the file" named by Filename.
Is the body evaluated? What is Stream bound to?
Proposal: WITH-OPEN-FILE-DOES-NOT-EXIST:DONT-EVALUATE
If OPEN does not return a stream (e.g., it returns NIL), then the body of
WITH-OPEN-FILE is not evaluated and NIL is returned.
Rationale:
The contract that "the body is evaluated with ... bound to a stream"
is maintained in the sense of a vacuous evaluation.
The alternatives are:
To let the stream variable be bound to NIL (unintuitive and dangerous).
If users want to Signal-An-Error in this case, they can use
:if-does-not-exist :error
The test for (STREAMP Stream) is probably done anyway,
since the UNWIND-PROTECT cleanup form can't call CLOSE on NIL.
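The DONT-EVALUATE behavior can be approximated in user code. The macro below
is only a sketch and is not part of the proposal: WITH-OPEN-FILE-OR-SKIP is a
hypothetical name, and the expansion assumes that the underlying
WITH-OPEN-FILE tolerates OPEN returning NIL. Declarations at the head of the
body are not handled.

```lisp
;; Hypothetical macro approximating DONT-EVALUATE on top of an
;; existing WITH-OPEN-FILE that binds the variable to whatever OPEN returns.
(defmacro with-open-file-or-skip ((var filespec &rest options) &body body)
  "Like WITH-OPEN-FILE, but skip BODY and return NIL when OPEN yields no stream."
  `(with-open-file (,var ,filespec ,@options)
     (when (streamp ,var)
       ,@body)))
```

With this macro, a form such as

    (with-open-file-or-skip (foo "no-such-file" :if-does-not-exist nil)
      (read foo) t)

returns NIL without attempting any input.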
Proposal: WITH-OPEN-FILE-DOES-NOT-EXIST:STREAM-IS-NIL
Clarify the documentation to explain that:
Stream is bound to the value returned by OPEN.
Users of :if-does-not-exist NIL should check for a valid stream.
Rationale:
This is simple to implement; no extra testing is done.
Users who use :if-does-not-exist NIL can wrap their body forms
with (when (STREAMP Stream) ...)
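A minimal sketch of that idiom follows; READ-FIRST-FORM is a hypothetical
helper name. Because the STREAMP test is written out explicitly, the function
returns NIL for a missing file under either proposal.

```lisp
;; Hypothetical helper showing the explicit STREAMP guard.
(defun read-first-form (pathname)
  "Return the first form in the file PATHNAME, or NIL if the file
does not exist or is empty."
  (with-open-file (in pathname :direction :input :if-does-not-exist nil)
    (when (streamp in)
      ;; EOF-ERROR-P is NIL, so an empty file also yields NIL.
      (read in nil nil))))
```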
Examples:
1. (WITH-OPEN-FILE (foo "no-such-file" :IF-DOES-NOT-EXIST nil)
(READ foo) t)
DONT-EVALUATE: => NIL, no I/O is done, do not read from *standard-input*
STREAM-IS-NIL: => T, reads from *standard-input*
2. (WITH-OPEN-FILE (foo "/no-dir" :direction :output :IF-DOES-NOT-EXIST nil)
(format foo "~S" t) t)
DONT-EVALUATE: => NIL, no string is created.
STREAM-IS-NIL: => T, creates a string and writes to it.
Current practice:
Symbolics and Lucid apparently implement STREAM-IS-NIL.
Cost to Implementors:
STREAM-IS-NIL: no cost.
DONT-EVALUATE:
Trivial? to test for :if-does-not-exist NIL and supply a
test for (STREAMP Stream) in that case [or in every case].
Cost to Users:
DONT-EVALUATE: System tests for (STREAMP Stream), possibly extraneously.
STREAM-IS-NIL: User must write a test for (STREAMP Stream).
Probably no portable code uses :if-does-not-exist NIL without
testing explicitly for (STREAMP Stream).
Cost of non-adoption:
The current situation is non-intuitive and/or confusing.
Benefits:
Users would know if the STREAMP test has been done or whether
they must supply it.
Esthetics:
Discussion:
∂16-Mar-89 1045 X3J13-mailer DRAFT Issue: CONDITION-RESTARTS (Version 1)
Received: from Xerox.COM by SAIL.Stanford.EDU with TCP; 16 Mar 89 10:44:53 PST
Received: from Semillon.ms by ArpaGateway.ms ; 16 MAR 89 10:30:53 PST
Date: 16 Mar 89 10:24 PST
From: masinter.pa@Xerox.COM
Subject: DRAFT Issue: CONDITION-RESTARTS (Version 1)
To: x3j13@SAIL.Stanford.EDU
line-fold: NO
Message-ID: <890316-103053-4587@Xerox>
There will possibly be a new version of this issue available
at the meeting. Additional comments excerpted at the end...
!
Issue: CONDITION-RESTARTS
Forum: Cleanup
References: Common Lisp Condition System
Category: CHANGE
Edit history: 18-Jan-89, Version 1 by Pitman
Problem Description:
It was noted in the condition system document itself, and many people have
complained privately, that a major weakness of the condition system is the
inability to know whether a particular restart is associated with a
particular signalling action.
The problem being addressed shows itself in situations involving recursive
errors. The programmer wants to make sure that a restart obtained from
FIND-RESTART or COMPUTE-RESTARTS is in fact present for the purpose of
handling some particular error that he is actively focussed on, and not
for some other (outer) error which he was not actively trying to handle.
Proposal (CONDITION-RESTARTS:PERMIT-ASSOCIATION):
1. Define that it is an error for SIGNAL to be called on a condition
more than once.
2. Introduce a function COPY-CONDITION:
COPY-CONDITION condition [Function]
Returns a copy of the given condition.
3. Introduce a macro WITH-CONDITION-RESTARTS which can be used to
dynamically bind the association between a condition and a set
of restarts.
WITH-CONDITION-RESTARTS (condition-form restarts-form) &BODY forms
[Macro]
Evaluates CONDITION-FORM and RESTARTS-FORM, the results of which
should be a condition and a list of restarts, respectively. Then
evaluates the body of forms in implicit-progn style, returning the
last form. While in the dynamic context of the body, the function
COMPUTE-RESTARTS will, when given an argument that was the result
of evaluating the CONDITION-FORM, return the list of restarts that
was the result of evaluating the RESTARTS-FORM.
Only the innermost call to WITH-CONDITION-RESTARTS with a given
condition is relevant. In this way, the set of restarts associated
with a given condition can be dynamically extended or restricted.
Usually this macro is not used explicitly in code, since
SIGNAL-WITH-RESTARTS and ERROR-WITH-RESTARTS handle most of the
common cases in a way that is syntactically more concise.
4. Extend COMPUTE-RESTARTS, FIND-RESTART, ABORT, CONTINUE, USE-VALUE,
and STORE-VALUE to permit an optional condition object as an argument.
When the extra argument is not supplied, these functions behave
exactly as defined before. (Restarts are considered without
prejudice to whether they have been associated with conditions.)
When this argument is supplied, only restarts associated with
the given condition are considered. In all other respects, the
behavior is the same.
Passing a condition argument of NIL is treated the same as passing
no condition argument.
5. Add two new macros SIGNAL-WITH-RESTARTS and ERROR-WITH-RESTARTS:
SIGNAL-WITH-RESTARTS condition &rest restart-clauses [Macro]
This does several things:
1. It enters a context in which the indicated RESTART-CLAUSES
are available. They have the same form as the clauses in
a RESTART-CASE.
2. It evaluates the CONDITION expression. [This is done after the
restarts are instantiated because the restarts are probably
still useful in the debugger if an error occurs during the
evaluation of the condition.] The result of the evaluation
must be a condition object.
3. It associates the condition which resulted from the evaluation
with the restarts established in step 1, using the equivalent
of WITH-CONDITION-RESTARTS.
4. It calls SIGNAL on the same condition.
ERROR-WITH-RESTARTS condition &rest restart-clauses [Macro]
Like SIGNAL-WITH-RESTARTS but uses ERROR rather than SIGNAL
in step 4.
6. Define that Common Lisp macros such as CHECK-TYPE, which are defined
to signal and to make restarts available, use the equivalent of
WITH-CONDITION-RESTARTS to associate the conditions they signal with
the defined restarts, so that users can make reliable tests not only
for the restarts being available, but also for them being available
for the right reasons.
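The four steps given for SIGNAL-WITH-RESTARTS in item 5 can be sketched as a
macro. This is an illustration under stated assumptions, not the proposed
implementation: it is written against the operators as they were later
standardized, where WITH-CONDITION-RESTARTS takes the condition form and the
restarts form without the enclosing parentheses proposed here. Restarts are
looked up by name, so clauses with duplicate or NIL names are not handled.

```lisp
;; Sketch only. Assumes the later-standardized syntax
;;   (WITH-CONDITION-RESTARTS condition-form restarts-form form*)
(defmacro signal-with-restarts (condition-form &rest restart-clauses)
  (let ((condition (gensym "CONDITION")))
    ;; Step 1: establish the restarts.
    `(restart-case
         ;; Step 2: evaluate the condition form inside their scope.
         (let ((,condition ,condition-form))
           ;; Step 3: associate the condition with those restarts.
           (with-condition-restarts ,condition
               (list ,@(mapcar (lambda (clause)
                                 `(find-restart ',(first clause)))
                               restart-clauses))
             ;; Step 4: signal the same condition object.
             (signal ,condition)))
       ,@restart-clauses)))
```

A handler can then ask for a restart that belongs to the condition it was
given, for example (FIND-RESTART 'USE-VALUE C), rather than accepting
whichever USE-VALUE restart happens to be innermost.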
Rationale:
1. The ability to recycle a condition object (including the ability to
resignal a condition) means that the same condition object might be
simultaneously active for two different purposes. In such a case,
no test (not even EQ) would suffice to determine whether a particular
restart belonged with a particular signalling action, since the
condition could not uniquely identify the signalling action. By saying
that a given condition may only be signalled once, we guarantee that
the condition can serve as a unique identifier for a signalling action.
2. Since there may now be some code which has begun to rely on the ability
to re-signal a condition, COPY-CONDITION will help to make this
transition easier. Instead of
(SIGNAL already-signalled-condition)
one can write:
(SIGNAL (COPY-CONDITION already-signalled-condition))
3. This is the minimal level of support needed to set up an
association between restarts and conditions.
4. This provides a natural interface for retrieving and using the
information about the associations between conditions and restarts.
5. This provides a natural interface for the most common case of
wanting to signal a condition with some associated restarts.
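The effect of points 3 and 4 can be seen without signalling anything. The
sketch below assumes the operators as later standardized (WITH-CONDITION-RESTARTS
without the extra parentheses, and FIND-RESTART with an optional condition
argument); RESTART-VISIBILITY is a hypothetical name.

```lisp
;; Sketch only: shows which lookups see a restart that has been
;; associated with one particular condition object.
(defun restart-visibility ()
  (let ((this-condition  (make-condition 'simple-error :format-control "this"))
        (other-condition (make-condition 'simple-error :format-control "other")))
    (restart-case
        (with-condition-restarts this-condition
            (list (find-restart 'use-value))
          (list
           ;; No condition argument: restarts are considered without prejudice.
           (not (null (find-restart 'use-value)))
           ;; The associated condition: the restart is found.
           (not (null (find-restart 'use-value this-condition)))
           ;; An unrelated condition: the restart is filtered out.
           (not (null (find-restart 'use-value other-condition)))))
      (use-value (v) v))))
```

Under those assumptions the function returns (T T NIL).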
Test Case:
(HANDLER-BIND ((ERROR #'(LAMBDA (C) (SIGNAL C)))) (SIGNAL "Test"))
was permissible, but this proposal makes it an error.
(DEFUN TEST-CONDITION-STUFF (OFFER-EXTRA-RESTART
USE-CONDITION-ARGUMENT
USE-FOUND-RESTART)
(HANDLER-BIND ((CONDITION
#'(LAMBDA (C)
(LET ((R0 (FIND-RESTART 'USE-VALUE))
(R1 (IF USE-CONDITION-ARGUMENT
(FIND-RESTART 'USE-VALUE C)
(FIND-RESTART 'USE-VALUE))))
(IF (AND R1 USE-FOUND-RESTART)
(INVOKE-RESTART R1 (EQ R0 R1))
(USE-VALUE (EQ R0 R1)))))))
(HANDLER-BIND ((CONDITION
#'(LAMBDA (C)
(USE-VALUE
(IF OFFER-EXTRA-RESTART
(WITH-RESTARTS
(SIGNAL (COPY-CONDITION C))
(USE-VALUE (X) (LIST 'EXTRA X)))
(SIGNAL (COPY-CONDITION C)))))))
(SIGNAL-WITH-RESTARTS (MAKE-CONDITION 'SIMPLE-CONDITION
:FORMAT-STRING "Test")
(USE-VALUE (X) X)))))
Previously, this was an error because it uses non-existent primitives, but
if you assume that
- COPY-CONDITION is implemented in the `obvious' way
- SIGNAL-WITH-RESTARTS just uses WITH-RESTARTS and SIGNAL
- FIND-RESTART ignores its last argument
in the obvious naive ways, it is possible to compare the old and new behavior:
Current Proposed
(TEST-CONDITION-STUFF NIL NIL NIL) => T T
(TEST-CONDITION-STUFF NIL NIL T) => T T
(TEST-CONDITION-STUFF NIL T NIL) => T T
(TEST-CONDITION-STUFF NIL T T) => T T
(TEST-CONDITION-STUFF T NIL NIL) => T (EXTRA T)
(TEST-CONDITION-STUFF T NIL T) => T (EXTRA T)
(TEST-CONDITION-STUFF T T NIL) => T (EXTRA NIL)
(TEST-CONDITION-STUFF T T T) => T NIL
Current Practice:
Presumably no implementation does this yet.
Cost to Implementors:
Several small, relatively modular changes.
Cost to Users:
Except for the change to the recyclability of conditions, this change is
upward compatible.
Probably very few if any users currently take advantage of recycling
conditions, so the cost to users of this change is very slight.
Even in the case where recycling is used, a straightforward rewrite in
terms of COPY-CONDITION is probably feasible.
Cost of Non-Adoption:
Use of restarts would not be nearly as reliable.
Benefits:
It would be possible to write code which was considerably more robust.
Aesthetics:
Some people might consider this proposal to make things slightly better
because it avoids some ambiguities. Others might consider it to make
things slightly worse because it adds additional complexity.
Discussion:
Pitman thinks a change of this sort is important.
!
"CONDITION-RESTARTS:PERMIT-ASSOCIATION looks fine to me.
It would certainly clean things up in some code I'm working on.."
"I strongly favor this proposal; it removes the major objection that I
had to the CL condition system as it developed.
However, I don't favor the COPY-CONDITION function. I don't think it's
necessary. More importantly, you have not proposed any concrete specification
of what it does, and unless someone does, it cannot be included in the
language. Fortunately, I think we can just drop it, as I doubt that any
portable program would use it in any significant way that could not just
as well be done with a tiny amount of code using other existing primitives."
[generally agreed]
" .. how (should) the condition/restart association
might be implemented -- is some kind of alist structure held by a
special variable what was intended, or ought the condition have a
restarts slot? ... it's pretty obvious that the relation should be externally
represented. It's important that the association not be done by a slot
in the condition because if you carry around the condition object after
you're done signalling, you don't want it to contain useless and/or
misleading information about restarts that no longer exist."
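One way to picture the external representation described in this comment is
an association list held in a special variable. The sketch below is purely
illustrative: *CONDITION-RESTART-ASSOCIATIONS*, WITH-ASSOCIATIONS, and
ASSOCIATED-RESTARTS are hypothetical names, and nothing here is claimed to be
how any implementation does it.

```lisp
;; Hypothetical external table: innermost associations come first.
(defvar *condition-restart-associations* '())

(defmacro with-associations ((condition-form restarts-form) &body body)
  "Associate a condition with a list of restarts for the dynamic extent of BODY."
  `(let ((*condition-restart-associations*
           (acons ,condition-form ,restarts-form
                  *condition-restart-associations*)))
     ,@body))

(defun associated-restarts (condition)
  "Return the restarts from the innermost association for CONDITION, or NIL."
  (cdr (assoc condition *condition-restart-associations* :test #'eq)))
```

Because the table is dynamically bound, the association disappears when the
body exits, so a condition object kept around afterwards carries no stale
restart information. ASSOC returns the first match, which gives the
"only the innermost call is relevant" rule.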
"... syntax to SIGNAL-WITH-RESTARTS and
ERROR-WITH-RESTARTS should be:
SIGNAL-WITH-RESTARTS signal-argument-list &rest restart-clauses
ERROR-WITH-RESTARTS signal-argument-list &rest restart-clauses
so that you would write
(SIGNAL-WITH-RESTARTS ('FOOD-COLOR-ERROR :FOOD 'LETTUCE :COLOR 'PINK)
...restart clauses...)
rather than
(SIGNAL-WITH-RESTARTS (MAKE-CONDITION 'FOOD-COLOR-ERROR
:FOOD 'LETTUCE :COLOR 'PINK)
...restart clauses...)
If you wanted to use MAKE-CONDITION, you would then write:
(SIGNAL-WITH-RESTARTS ((MAKE-CONDITION 'FOOD-COLOR-ERROR
:FOOD 'LETTUCE :COLOR 'PINK))
...restart clauses...)
The advantage of what he proposes is that you could write
(SIGNAL-WITH-RESTARTS ("Bad ~S color" 'FOOD)
...restart clauses...)
and a condition object would be created implicitly as with SIGNAL. A
possible disadvantage is that
(SIGNAL-WITH-RESTARTS (FOO BAR BAZ)
...restart clauses...)
might look to someone like the FOO in (FOO BAR BAZ) named a function
rather than a variable. "
"... even better would be
(WITH-CONDITION-RESTARTS signal-form &rest restart-clauses)
where signal-form must be an invocation of SIGNAL, ERROR, WARN, or
perhaps a few others, or a macro that expands into such an invocation.
WITH-CONDITION-RESTARTS must signal an error at all levels of safety if
it does not recognize the signal-form. This is "weird" because it uses
a form for something other than evaluation (but not unprecedented; this
is exactly what SETF does). The advantage is that it just nests with an
existing syntax instead of inventing a new, awkward syntax.
Note that I stole the "good name" WITH-CONDITION-RESTARTS for this
commonly used syntax. The less commonly used primitive that just sets
up the restarts without signalling doesn't need as good a name."
"... the syntax for WITH-CONDITION-RESTARTS should be
WITH-CONDITION-RESTARTS condition-form restarts-form &BODY forms
rather than
WITH-CONDITION-RESTARTS (condition-form restarts-form) &BODY forms
which it is now. Does anyone else have an opinion?
This is probably a good idea. I'd probably name this one
WITH-CONDITION-RESTARTS-INTERNAL. But are we sure that this operation
needs to be named in the standard?
"
∂25-Mar-89 2239 X3J13-mailer **DRAFT** Issue: ERROR-CHECKING-IN-NUMBERS-CHAPTER (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 25 Mar 89 22:38:54 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 565475; Sun 26-Mar-89 01:38:41 EST
Date: Sun, 26 Mar 89 01:38 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: **DRAFT** Issue: ERROR-CHECKING-IN-NUMBERS-CHAPTER (Version 1)
To: X3J13@SAIL.Stanford.EDU
Message-ID: <890326013810.9.KMP@BOBOLINK.SCRC.Symbolics.COM>
>>> PLEASE DO -NOT- REPLY TO THIS ISSUE <<<
It's too late for any discussion, but this is just for the information
of anyone still tracking mail at this late date. If you have comments,
just bring them to the meeting. Thanks.
-kmp
-----
Issue: ERROR-CHECKING-IN-NUMBERS-CHAPTER
Forum: Cleanup
References: Numbers (pp193-232),
S:>kmp>cl>conditions>revision-18-notes.text.34
(formerly S:>kmp>cl-conditions.text.34),
Category: CHANGE
Edit history: 06-Mar-89, Version 1 by Pitman
Status: For Internal Discussion
Problem Description:
In many cases, CLtL specifies ``is an error'' situations for functions
in places that programmers would prefer to see an error signalled.
Reliably signalling an error accomplishes two things...
- It eases the development process by making it easier to notice errors
in a timely fashion.
- It makes it easier for code to reliably handle errors at runtime,
leading to greater robustness in delivered applications.
Proposal (ERROR-CHECKING-IN-NUMBERS-CHAPTER:SCARECROW): - a straw man...
ABS should signal TYPE-ERROR if its argument is not type NUMBER.
ACOS should signal TYPE-ERROR if its argument is not type NUMBER.
ACOS might signal ARITHMETIC-ERROR.
ACOSH should signal TYPE-ERROR if its argument is not type NUMBER.
ACOSH might signal ARITHMETIC-ERROR.
ASH should signal TYPE-ERROR if either argument is not type INTEGER.
ASH might signal ARITHMETIC-ERROR.
ASIN should signal TYPE-ERROR if its argument is not type NUMBER.
ASIN might signal ARITHMETIC-ERROR.
ASINH should signal TYPE-ERROR if its argument is not type NUMBER.
ASINH might signal ARITHMETIC-ERROR.
ATAN should signal TYPE-ERROR if exactly one argument is given and that
argument is not type NUMBER.
ATAN should signal TYPE-ERROR if exactly two arguments are given and
either argument is not type (AND NUMBER (NOT COMPLEX)).
ATAN might signal ARITHMETIC-ERROR.
ATANH should signal TYPE-ERROR if its argument is not type NUMBER.
ATANH might signal ARITHMETIC-ERROR.
BOOLE should signal TYPE-ERROR if its first argument is not type
(MEMBER #.BOOLE-CLR #.BOOLE-SET #.BOOLE-1 #.BOOLE-2
#.BOOLE-C1 #.BOOLE-C2 #.BOOLE-AND #.BOOLE-IOR
#.BOOLE-XOR #.BOOLE-EQV #.BOOLE-NAND #.BOOLE-NOR
#.BOOLE-ANDC1 #.BOOLE-ANDC2 #.BOOLE-ORC1 #.BOOLE-ORC2)
BOOLE should signal TYPE-ERROR if either its second or third
argument is not type INTEGER.
BYTE should signal TYPE-ERROR if either argument is not type INTEGER.
BYTE-POSITION might signal TYPE-ERROR if its argument is not a byte
specifier (something that was returned by the BYTE
function). Note that byte specifiers are not required
to be disjoint from other types, so this error checking
is only heuristic.
BYTE-SIZE might signal TYPE-ERROR if its argument is not a byte
specifier (something that was returned by the BYTE
function). Note that byte specifiers are not required
to be disjoint from other types, so this error checking
is only heuristic.
CEILING should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
CEILING should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
CEILING should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
CEILING might signal ARITHMETIC-ERROR.
COMPLEX should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
COMPLEX should signal TYPE-ERROR if its second argument is provided
but is not type (AND NUMBER (NOT COMPLEX)).
CONJUGATE should signal TYPE-ERROR if its argument is not type NUMBER.
CIS should signal TYPE-ERROR if its argument is not type
(AND NUMBER (NOT COMPLEX)).
CIS might signal ARITHMETIC-ERROR.
COS should signal TYPE-ERROR if its argument is not type NUMBER.
COS might signal ARITHMETIC-ERROR.
COSH should signal TYPE-ERROR if its argument is not type NUMBER.
COSH might signal ARITHMETIC-ERROR.
DECF might signal SYNTAX-ERROR at semantic processing time.
DECF should signal TYPE-ERROR at runtime if the variable to be
decremented does not have a value of type NUMBER.
DECF might signal ARITHMETIC-ERROR at runtime.
DECODE-FLOAT should signal TYPE-ERROR if its argument is not type FLOAT.
DENOMINATOR should signal TYPE-ERROR if its argument is not type RATIONAL.
DEPOSIT-FIELD should signal TYPE-ERROR if its first argument is not type
INTEGER.
DEPOSIT-FIELD might signal TYPE-ERROR if its second argument is not a
bytespec (something returned by BYTE). Note that byte
specifiers are not required to be disjoint from other types,
so this error checking is only heuristic.
DEPOSIT-FIELD should signal TYPE-ERROR if its third argument is not type
INTEGER.
DPB should signal TYPE-ERROR if its first argument is not type INTEGER.
DPB might signal TYPE-ERROR if its second argument is not a bytespec
(something returned by BYTE). Note that byte specifiers are not
required to be disjoint from other types, so this error checking
is only heuristic.
DPB should signal TYPE-ERROR if its third argument is not type INTEGER.
EVENP should signal TYPE-ERROR if its argument is not type INTEGER.
FCEILING should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
FCEILING should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
FCEILING should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
FCEILING might signal ARITHMETIC-ERROR.
FFLOOR should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
FFLOOR should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
FFLOOR should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
FFLOOR might signal ARITHMETIC-ERROR.
FLOOR should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
FLOOR should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
FLOOR should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
FLOOR might signal ARITHMETIC-ERROR.
FROUND should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
FROUND should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
FROUND should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
FROUND might signal ARITHMETIC-ERROR.
FTRUNCATE should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
FTRUNCATE should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
FTRUNCATE should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
FTRUNCATE might signal ARITHMETIC-ERROR.
GCD should signal TYPE-ERROR if any argument is not type INTEGER.
GCD might signal ARITHMETIC-ERROR.
EXP should signal TYPE-ERROR if its argument is not type NUMBER.
EXP might signal ARITHMETIC-ERROR.
EXPT should signal TYPE-ERROR if either argument is not type NUMBER.
EXPT might signal ARITHMETIC-ERROR. e.g., (EXPT 0 0.0)
FLOAT should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
FLOAT should signal TYPE-ERROR if its second argument is supplied
but is not type FLOAT.
FLOAT might signal ARITHMETIC-ERROR.
FLOAT-DIGITS should signal TYPE-ERROR if its argument is not type FLOAT.
FLOAT-PRECISION should signal TYPE-ERROR if its argument is not type FLOAT.
FLOAT-RADIX should signal TYPE-ERROR if its argument is not type FLOAT.
FLOAT-SIGN should signal TYPE-ERROR if its first argument is not type
FLOAT.
FLOAT-SIGN should signal TYPE-ERROR if its second argument is supplied
but is not type FLOAT.
INCF might signal SYNTAX-ERROR at semantic processing time.
INCF should signal TYPE-ERROR at runtime if the variable to be
incremented does not have a value of type NUMBER.
INCF might signal ARITHMETIC-ERROR at runtime.
INTEGER-DECODE-FLOAT should signal TYPE-ERROR if its argument is not
type FLOAT.
INTEGER-LENGTH should signal TYPE-ERROR if its argument is not type INTEGER.
IMAGPART should signal TYPE-ERROR if its argument is not type NUMBER.
ISQRT should signal TYPE-ERROR if its argument is not type (INTEGER 0).
ISQRT might signal ARITHMETIC-ERROR.
LCM should signal TYPE-ERROR if any argument is not type INTEGER.
LCM might signal ARITHMETIC-ERROR.
LDB might signal TYPE-ERROR if its first argument is not a bytespec
(something returned by BYTE). Note that byte specifiers are not
required to be disjoint from other types, so this error checking
is only heuristic.
LDB should signal TYPE-ERROR if its second argument is not type INTEGER.
LDB-TEST might signal TYPE-ERROR if its first argument is not a bytespec
(something returned by BYTE). Note that byte specifiers are not
required to be disjoint from other types, so this error checking
is only heuristic.
LDB-TEST should signal TYPE-ERROR if its second argument is not type INTEGER.
LOG should signal TYPE-ERROR if either argument is not type NUMBER.
LOG might signal ARITHMETIC-ERROR.
LOGAND should signal TYPE-ERROR if any argument is not type INTEGER.
LOGANDC1 should signal TYPE-ERROR if either argument is not type INTEGER.
LOGANDC2 should signal TYPE-ERROR if either argument is not type INTEGER.
LOGBITP should signal TYPE-ERROR if its first argument is not type
(INTEGER 0).
LOGBITP should signal TYPE-ERROR if its second argument is not type
INTEGER.
LOGCOUNT should signal TYPE-ERROR if its argument is not type INTEGER.
LOGEQV should signal TYPE-ERROR if any argument is not type INTEGER.
LOGIOR should signal TYPE-ERROR if any argument is not type INTEGER.
LOGNAND should signal TYPE-ERROR if either argument is not type INTEGER.
LOGNOR should signal TYPE-ERROR if either argument is not type INTEGER.
LOGNOT should signal TYPE-ERROR if its argument is not type INTEGER.
LOGORC1 should signal TYPE-ERROR if either argument is not type INTEGER.
LOGORC2 should signal TYPE-ERROR if either argument is not type INTEGER.
LOGTEST should signal TYPE-ERROR if either argument is not type INTEGER.
LOGXOR should signal TYPE-ERROR if any argument is not type INTEGER.
MAKE-RANDOM-STATE should signal TYPE-ERROR if an argument is supplied
but is not type (OR (MEMBER NIL T) RANDOM-STATE).
MASK-FIELD might signal TYPE-ERROR if its first argument is not a bytespec
(something returned by BYTE). Note that byte specifiers are not
required to be disjoint from other types, so this error checking
is only heuristic.
MASK-FIELD should signal TYPE-ERROR if its second argument is not type
INTEGER.
MAX should signal TYPE-ERROR if any argument is not type
(AND NUMBER (NOT COMPLEX)).
MAX might signal ARITHMETIC-ERROR.
MIN should signal TYPE-ERROR if any argument is not type
(AND NUMBER (NOT COMPLEX)).
MIN might signal ARITHMETIC-ERROR.
MINUSP should signal TYPE-ERROR if its argument is not type
(AND NUMBER (NOT COMPLEX)).
MOD should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
MOD should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
MOD should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
MOD might signal ARITHMETIC-ERROR.
NUMERATOR should signal TYPE-ERROR if its argument is not type RATIONAL.
ODDP should signal TYPE-ERROR if its argument is not type INTEGER.
PHASE should signal TYPE-ERROR if its argument is not type NUMBER.
PHASE might signal ARITHMETIC-ERROR.
PLUSP should signal TYPE-ERROR if its argument is not type
(AND NUMBER (NOT COMPLEX)).
RANDOM should signal TYPE-ERROR if its first argument is not
type (INTEGER 1).
RANDOM should signal TYPE-ERROR if its second argument is supplied
but is not type RANDOM-STATE.
RANDOM-STATE-P will never signal an error.
RATIONAL should signal TYPE-ERROR if its argument is not type
(AND NUMBER (NOT COMPLEX)).
RATIONAL might signal ARITHMETIC-ERROR.
RATIONALIZE should signal TYPE-ERROR if its argument is not type
(AND NUMBER (NOT COMPLEX)).
RATIONALIZE might signal ARITHMETIC-ERROR.
REALPART should signal TYPE-ERROR if its argument is not type NUMBER.
REM should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
REM should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
REM should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
REM might signal ARITHMETIC-ERROR.
ROUND should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
ROUND should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
ROUND should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
ROUND might signal ARITHMETIC-ERROR.
SCALE-FLOAT should signal TYPE-ERROR if its first argument is not
type FLOAT.
SCALE-FLOAT should signal TYPE-ERROR if its second argument is not
type INTEGER.
SIGNUM should signal TYPE-ERROR if its argument is not type NUMBER.
SIN should signal TYPE-ERROR if its argument is not type NUMBER.
SIN might signal ARITHMETIC-ERROR.
SINH should signal TYPE-ERROR if its argument is not type NUMBER.
SINH might signal ARITHMETIC-ERROR.
SQRT should signal TYPE-ERROR if its argument is not type NUMBER.
SQRT might signal ARITHMETIC-ERROR.
TAN should signal TYPE-ERROR if its argument is not type NUMBER.
TAN might signal ARITHMETIC-ERROR.
TANH should signal TYPE-ERROR if its argument is not type NUMBER.
TANH might signal ARITHMETIC-ERROR.
TRUNCATE should signal TYPE-ERROR if its first argument is not type
(AND NUMBER (NOT COMPLEX)).
TRUNCATE should signal TYPE-ERROR if its second argument is supplied
but is not type (AND NUMBER (NOT COMPLEX)).
TRUNCATE should signal DIVISION-BY-ZERO if its second argument is
supplied and is zero.
TRUNCATE might signal ARITHMETIC-ERROR.
ZEROP should signal TYPE-ERROR if its argument is not type NUMBER.
* should signal TYPE-ERROR if any argument is not type NUMBER.
* might signal ARITHMETIC-ERROR.
+ should signal TYPE-ERROR if any argument is not type NUMBER.
+ might signal ARITHMETIC-ERROR.
- should signal TYPE-ERROR if any argument is not type NUMBER.
- might signal ARITHMETIC-ERROR.
/ should signal TYPE-ERROR if any argument is not type NUMBER.
/ should signal DIVISION-BY-ZERO if any divisor argument is zero.
/ might signal ARITHMETIC-ERROR.
/= should signal TYPE-ERROR if any argument is not type NUMBER.
/= might signal ARITHMETIC-ERROR.
1+ should signal TYPE-ERROR if its argument is not type NUMBER.
1+ might signal ARITHMETIC-ERROR.
1- should signal TYPE-ERROR if its argument is not type NUMBER.
1- might signal ARITHMETIC-ERROR.
< should signal TYPE-ERROR if any argument is not type
(AND NUMBER (NOT COMPLEX)).
< might signal ARITHMETIC-ERROR.
<= should signal TYPE-ERROR if any argument is not type
(AND NUMBER (NOT COMPLEX)).
<= might signal ARITHMETIC-ERROR.
= should signal TYPE-ERROR if any argument is not type NUMBER.
= might signal ARITHMETIC-ERROR.
> should signal TYPE-ERROR if any argument is not type
(AND NUMBER (NOT COMPLEX)).
> might signal ARITHMETIC-ERROR.
>= should signal TYPE-ERROR if any argument is not type
(AND NUMBER (NOT COMPLEX)).
>= might signal ARITHMETIC-ERROR.
Rationale:
This addresses the development and delivery concerns mentioned in the
Problem Description.
Current Practice:
Most implementations probably do not reliably signal the indicated
errors in safe code.
Cost to Implementors:
In implementations not providing the indicated error checking,
considerable work might need to be done.
The alternative is to identify the implementation as an "unsafe" subset.
However, while it is a "subset" in the sense that code that was developed
in it will run in the superset, it is important to understand that such
implementations are not simply "places you can run code that's been
thoroughly debugged in the full language" since such debugged code may
still depend on the reliable detection and handling of certain kinds of
errors.
Cost to Users:
Technically none. These are supposedly already all `is an error'
cases that people can't depend on.
Some users might be relying on an implementation not to signal a particular
error in compiled code. Such code is already not portable, however.
In some cases, where an implementation adds error checking that users
consider unnecessary, those users will need to add some OPTIMIZE proclamations.
Some users will see this as a bug fix.
Cost of Non-Adoption:
The error handling facilities will be a lot less useful.
Code that uses error handling will not port well.
Benefits:
Better development environments. More robust delivered applications.
Aesthetics:
Hopefully improved.
Discussion:
Pitman thinks getting this level of detail is a good idea. He's amenable to
specific changes if they improve the overall level of receptiveness to
the proposal.
Notes about how to proceed on this issue
(to be removed prior to final vote):
- Is anyone uncomfortable about the name ARITHMETIC-ERROR?
If so, can someone mathematically inclined suggest a better name?
`MATH-ERROR' or `NUMBER-ERROR' come to mind.
- In some of the cases of "might signal", it might be the case that
no signal should ever occur. Someone who's actually implemented these
functions might want to suggest that in some cases we can remove
this verbiage, or give examples of the circumstances under which the
condition might be signalled?
∂23-Mar-89 1504 X3J13-mailer **DRAFT** Issue: PATHNAME-CANONICAL-TYPE (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 23 Mar 89 12:07:40 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via INTERNET with SMTP id 563815; 23 Mar 89 15:06:56 EST
Date: Thu, 23 Mar 89 15:06 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: **DRAFT** Issue: PATHNAME-CANONICAL-TYPE (Version 1)
To: X3J13@SAIL.Stanford.EDU
Message-ID: <890323150638.0.KMP@BOBOLINK.SCRC.Symbolics.COM>
>>> PLEASE DO -NOT- REPLY TO THIS ISSUE <<<
Bring your comments to the meeting.
See summary of CL-Cleanup discussion at end of message.
-kmp
-----
Issue: PATHNAME-CANONICAL-TYPE
References: MAKE-PATHNAME (p416)
Category: ADDITION
Edit history: 07-Jul-88, Version 1 by Pitman
Status: For Internal Discussion
Related-Issues: PATHNAME-COMPONENT-CASE
Problem Description:
The pathname-type of ``Lisp'' and ``Compiled Lisp'' files varies widely from
implementation to implementation.
"LSP" is common on Vax VMS. "lisp" is generally used for the Symbolics
file system. "l" and "lisp" are common on Unix. Some Lisp implementations
use customized extensions such as "cl" or even "jcl" (eg, for "Joe's CL").
It would be useful to probe the existence of either a source or a binary
file, but that cannot currently be done portably. Furthermore, it would be
useful to create certain standard kinds of files in a system-independent
fashion.
A common desire, for example, is to do
(DEFUN FILE-NEEDS-TO-BE-COMPILED (FILE)
(LET ((SOURCE (PROBE-FILE
(MERGE-PATHNAMES FILE (MAKE-PATHNAME :TYPE ???))))
(BINARY (PROBE-FILE
(MERGE-PATHNAMES FILE (MAKE-PATHNAME :TYPE ???)))))
... (FILE-WRITE-DATE SOURCE) ... (FILE-WRITE-DATE BINARY) ...))
The problem is that there's nothing portable to put in the ??? positions.
Indeed, depending on the host (ie, file system) of the pathname, the
type might need to differ even in the same Lisp implementation. For example,
Symbolics Genera stores its source files in names like "foo.l" on Unix,
"FOO.LSP" on VMS, etc.
Proposal (PATHNAME-CANONICAL-TYPE:NEW-CONCEPT):
In addition to the normal strings and keywords currently allowed as fillers
of the TYPE field of a pathname, allow other keywords which designate
``canonical types''.
A canonical type is translated to a real type by MAKE-PATHNAME so that
(PATHNAME-TYPE (MAKE-PATHNAME :TYPE canonical-type)) is a string.
Introduce a new function PATHNAME-CANONICAL-TYPE which returns the canonical
type of an argument pathname, or the type if there is no canonical type.
For example,
(PATHNAME-CANONICAL-TYPE (MAKE-PATHNAME :TYPE :LISP)) => :LISP
[This information may be explicitly represented as an additional slot, or
computed on demand using a lookup table, as the implementor prefers.]
Define the following standard types:
:LISP ``Lisp'' (source) file
:BIN ``Compiled Lisp'' (object) file
Permit implementations to extend the set of canonical type names.
Test Case:
(PATHNAME-TYPE (MAKE-PATHNAME :TYPE :LISP))
=> "LSP" ;Typically, on VMS
=> "l" or "lisp" ;Typically, on Unix
=> "L" or "LISP" ;Typically, on Unix
; (assuming PATHNAME-COMPONENT-CASE:CANONICALIZE adopted)
...etc.
(PATHNAME-TYPE (MAKE-PATHNAME :TYPE :BIN))
=> "FAS" ;eg, VAXLISP
=> "BIN" ;eg, Symbolics file system
...etc.
(PATHNAME-CANONICAL-TYPE (MAKE-PATHNAME :TYPE :LISP)) => :LISP
(PATHNAME-CANONICAL-TYPE (MAKE-PATHNAME :TYPE "LSP"))
=> :LISP ;eg, VAXLISP
=> "LSP" ;eg, Unix
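The lookup-table approach mentioned in the proposal can be sketched in
portable code roughly as follows. This is only an illustration under
stated assumptions: *CANONICAL-TYPES*, PHYSICAL-TYPE, CANONICAL-TYPE and
FILE-NEEDS-TO-BE-COMPILED-P are hypothetical names, the type strings in
the table are made up for the example, and the translation is done by a
helper rather than inside MAKE-PATHNAME as the proposal specifies.

```lisp
;; Hypothetical table: canonical type keyword -> physical type string.
;; The strings are illustrative, not taken from any implementation.
(defparameter *canonical-types*
  '((:lisp . "lisp")
    (:bin  . "fasl")))

(defun physical-type (type)
  "Translate a canonical type keyword to a string; pass strings through."
  (if (keywordp type)
      (or (cdr (assoc type *canonical-types*))
          (error "Unknown canonical type: ~S" type))
      type))

(defun canonical-type (pathname)
  "Return the canonical type of PATHNAME, or its type if there is none."
  (let ((type (pathname-type pathname)))
    (or (car (rassoc type *canonical-types* :test #'equal))
        type)))

(defun file-needs-to-be-compiled-p (file)
  "True if FILE's source exists and its binary is missing or older."
  (let ((source (probe-file
                 (merge-pathnames
                  file (make-pathname :type (physical-type :lisp)))))
        (binary (probe-file
                 (merge-pathnames
                  file (make-pathname :type (physical-type :bin))))))
    (and source
         (or (null binary)
             (> (file-write-date source) (file-write-date binary))))))
```

With canonical types built in, the two PHYSICAL-TYPE calls would simply
become :TYPE :LISP and :TYPE :BIN, filling the ??? positions in the
Problem Description.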
Rationale:
This is a useful subset of the functionality already available in
at least one implementation.
Current Practice:
Symbolics Genera implements this proposal.
Cost to Implementors:
The cost of implementing these proposed features is very slight.
MAKE-PATHNAME would have to change to coerce its :TYPE argument in implementations
where it does not do so already. PATHNAME-CANONICAL-TYPE can be implemented as a
fairly straightforward lookup.
Cost to Users:
None. This change is upward compatible.
Cost of Non-Adoption:
It would continue to be hard to portably name files when their types
differed from file system to file system.
Benefits:
The cost of non-adoption would be avoided.
Aesthetics:
Some programs would be able to abstract away from the particulars of the host
file system entirely. Some people believe this would be a definite improvement
in aesthetics.
Discussion:
Note that different Lisp implementations which share the same file system
need not and perhaps should not agree on the same type string for the
canonical type :BIN. That is, if I store source files on VAX VMS and compile
them both for use under Symbolics Genera and VAXLISP, then it is both
appropriate and useful that VAXLISP :BIN files be named "something.FAS"
and Genera :BIN files be named "something.BIN" since then they wouldn't
clobber each other.
Pitman supports PATHNAME-CANONICAL-TYPE:NEW-CONCEPT.
-------
Summary of discussion on CL-Cleanup:
GZ suggested :COMPILED-LISP as a better name than :BIN.
Masinter thought :SOURCE-LISP might be better than :LISP. Either of these
would be gratuitously incompatible with Symbolics Genera, which already
implements canonical types, but otherwise not technically unreasonable
and probably something we should discuss.
Sandra Loosemore offered the following revealing piece of code from her
work and asked why we couldn't just do this.
(defvar *binary-file-type*
#+Symbolics (make-pathname :type "bin")
#+(and dec common vax (not ultrix)) (make-pathname :type "FAS")
#+(and dec common vax ultrix) (make-pathname :type "fas")
#+pcls (make-pathname :type "b")
#+KCL (make-pathname :type "o")
#+Xerox (make-pathname :type "dfasl")
#+(and Lucid MC68000) (make-pathname :type "lbin")
#+(and Lucid VAX VMS) (make-pathname :type "vbin")
#+excl (make-pathname :type "fasl")
#+system::cmu (make-pathname :type "sfasl")
#+PRIME (make-pathname :type "pbin")
#+HP (make-pathname :type "b")
#+TI (make-pathname :type "xfasl")
"The default file type for compiled files.")
The reason is that some implementations (e.g., Symbolics) deal with more
than one file system type -- and properly the information varies with the
file system type, not with the implementations. [Since most implementations
have only one associated file system type, this may not be obvious, but it's
quite obvious on a Symbolics machine that you vary the extension name based
on the host file system requirements.]
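To make the point concrete, a mapping keyed first by file system type and
only then by canonical type might look like the sketch below. The file
system names :UNIX and :VMS and the names *TYPES-BY-FILE-SYSTEM* and
TYPE-FOR are assumptions made for this example; the two strings are the
Genera source-file conventions quoted in the Problem Description.

```lisp
;; Hypothetical two-level table: file system type -> canonical type ->
;; physical type string.  A single Lisp implementation consults a
;; different row depending on the host of the pathname.
(defparameter *types-by-file-system*
  '((:unix (:lisp . "l"))
    (:vms  (:lisp . "LSP"))))

(defun type-for (file-system canonical-type)
  "Return the type string for CANONICAL-TYPE on FILE-SYSTEM, or NIL."
  (cdr (assoc canonical-type
              (cdr (assoc file-system *types-by-file-system*)))))
```

A #+feature conditional like the one above can select only one row of
such a table at read time, which is why it cannot express this case.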
Moon suggested a compromise where *compile-file-output-type* (his name
for Sandra's *binary-file-type*) existed but could be either a canonical
type or a physical type.
Masinter worries about the PATHNAME-CANONICAL-TYPE part of the proposal
being forced to be heuristic in some cases. [Will any alternative be any
less heuristic? -kmp]
Moon wanted the following example to be guaranteed to work:
(PATHNAME-CANONICAL-TYPE (PATHNAME "foo.lisp")) => :LISP
where of course the string is implementation-dependent. That is,
PATHNAME-CANONICAL-TYPE must produce a canonical type even when the
pathname was not constructed from a canonical type, but instead came
from user typein, the TRUENAME function, the DIRECTORY function,
or some similar source, when the pathname's type is one that a
canonical type maps into.
Moon also thought it would be nice to have a facility for users
(in addition to implementations) to extend the set of canonical type
names, since users may well have their own types of files. However,
he admitted that the difficulty is that in any system that supports
multiple file systems, it has to be complex enough to allow
specification of separate mappings for each file system, which in
turn requires a way to name file system types. [At this point, we
probably don't have time left in our schedule to produce such a
facility. -kmp]
∂16-Jun-89 2239 X3J13-mailer Issue: PATHNAME-COMPONENT-CASE (version 5)
Received: from VALLECITO.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 16 Jun 89 22:38:34 PDT
Received: from EUPHRATES.SCRC.Symbolics.COM by VALLECITO.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 296248; Sat 17-Jun-89 01:09:50 EDT
Date: Sat, 17 Jun 89 01:08 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: PATHNAME-COMPONENT-CASE (version 5)
To: X3J13@sail.stanford.edu
Message-ID: <19890617050833.8.MOON@EUPHRATES.SCRC.Symbolics.COM>
Issue: PATHNAME-COMPONENT-CASE
References: Pathnames (pp410-413),
MAKE-PATHNAME (p416),
PATHNAME-HOST (p417),
PATHNAME-DEVICE (p417),
PATHNAME-DIRECTORY (p417),
PATHNAME-NAME (p417),
PATHNAME-TYPE (p417)
Related-issues: PATHNAME-WILD
Category: CHANGE
Edit history: 1-Jul-88, Version 1 by Pitman
22-Mar-89, Version 2 by Moon, update and rewrite
9-May-89, Version 3 by Moon, remove alternate proposals
9-May-89, Version 4 by Moon, respond to discussion with KMP
17-Jun-89, Version 5 by Moon, fix typo, make minor improvements
to the presentation.
Problem Description:
Issues of alphabetic case in pathnames are a major source of problems.
In some file systems, the customary case is lowercase, in some uppercase,
in some mixed. In some file systems, case matters, in others it does
not.
There are two kinds of pathname case portability problems: moving
programs from one Common Lisp to another, and moving pathname component
values from one file system to another. To solve the first problem, all
Common Lisp implementations that support a particular file system must
use compatible representations for pathname component values. To solve
the second problem, there must be a common representation for the least
common denominator pathname component values that exist on all
interesting file systems.
This desire for a common representation directly conflicts with the
desire among programmers who only use one file system to work with the
local conventions and not think about issues of porting to other file
systems. The common representation cannot be the same as every local
convention, since they vary.
In the current anarchy of pathname component case conventions:
(NAMESTRING (MAKE-PATHNAME :NAME "FOO" :TYPE "LISP"))
will produce foo.lisp in some Unix Common Lisp implementations
and will produce FOO.LISP in other Unix Common Lisp implementations.
(NAMESTRING (MAKE-PATHNAME :NAME "foo" :TYPE "lisp"))
will produce FOO.LISP in some Tops-20 Common Lisp implementations
and will produce "↑Vf↑Vo↑Vo.↑Vl↑Vi↑Vs↑Vp" in other Tops-20 Common
Lisp implementations.
Problems like this make it difficult to use MAKE-PATHNAME for much of
anything without corrective (non-portable) code.
Other problems occur in merging because doing
(NAMESTRING (MERGE-PATHNAMES (MAKE-PATHNAME :HOST "MY-TOPS-20" :NAME "FOO")
(PARSE-NAMESTRING "MY-UNIX:x.lisp")))
should probably return "MY-TOPS-20:FOO.LISP" but in fact might return
"MY-TOPS-20:FOO.↑Vl↑Vi↑Vs↑Vp" in some implementations.
Problems like this make it difficult to use any merging primitives for
much of anything without corrective (non-portable) code.
Proposal (PATHNAME-COMPONENT-CASE:KEYWORD-ARGUMENT):
Add a keyword argument :CASE to MAKE-PATHNAME, PATHNAME-HOST,
PATHNAME-DEVICE, PATHNAME-DIRECTORY, PATHNAME-NAME, and PATHNAME-TYPE.
The possible values for the argument are :COMMON and :LOCAL.
:LOCAL means strings input to MAKE-PATHNAME or output by PATHNAME-xxx
follow the local file system's conventions for alphabetic case.
Strings given to MAKE-PATHNAME will be used exactly as written if
the file system supports both cases. If the file system only
supports one case, the strings will be translated to that case.
:COMMON means strings input to MAKE-PATHNAME or output by PATHNAME-xxx
follow this common convention:
- all uppercase means to use a file system's customary case.
- all lowercase means to use the opposite of the customary case.
- mixed case represents itself.
The second and third bullets exist so that translation from local to
common and back to local is information-preserving.
The default is :COMMON.
Namestrings always use local file system case conventions.
MERGE-PATHNAMES and TRANSLATE-WILD-PATHNAME map customary case in the
input pathnames into customary case in the output pathname.
Implications of the proposal:
Unix is case-sensitive and prefers lowercase, so it translates between
common and local by inverting the case of non-mixed-case strings.
Tops-20 is case-sensitive and prefers uppercase, so it uses identical
representations for common and local.
VAX/VMS is upper-case-only (that is, the file system translates all file
name arguments to upper case), so it translates common to local by
upcasing, and translates local to common with no change.
Macintosh is case-insensitive and prefers lowercase, so it translates
between common and local by inverting the case of non-mixed-case strings,
and ignores case in EQUAL of two pathnames.
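The translation described for file systems that store both cases can be
sketched as below. STRING-CASE-KIND, INVERT-CASE and COMMON-TO-LOCAL are
hypothetical names, and the :UPPER / :LOWER argument naming the customary
case is an assumption of this example. The upper-case-only rule given for
VAX/VMS is not modeled.

```lisp
(defun string-case-kind (string)
  "Return :UPPER, :LOWER, or :MIXED, ignoring non-alphabetic characters."
  (let ((upper (some #'upper-case-p string))
        (lower (some #'lower-case-p string)))
    (cond ((and upper lower) :mixed)
          (upper :upper)
          (lower :lower)
          (t :mixed))))                 ; no letters: represents itself

(defun invert-case (string)
  "Swap all-uppercase and all-lowercase; leave mixed case alone."
  (ecase (string-case-kind string)
    (:upper (string-downcase string))
    (:lower (string-upcase string))
    (:mixed string)))

(defun common-to-local (string customary-case)
  "Translate a :COMMON component string to local case.
CUSTOMARY-CASE is :UPPER (as for Tops-20) or :LOWER (as for Unix)."
  (ecase customary-case
    (:upper string)
    (:lower (invert-case string))))
```

Because INVERT-CASE is its own inverse, the same function also translates
local to common, which is the information-preserving property the second
and third bullets of the :COMMON convention are there to provide.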
Test Case/Examples:
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/foo.lisp")
:CASE :COMMON) => "FOO"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-TOPS-20:<ME>FOO.LISP")
:CASE :COMMON) => "FOO"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/foo.lisp")
:CASE :LOCAL) => "foo"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-TOPS-20:<ME>FOO.LISP")
:CASE :LOCAL) => "FOO"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/TeX.lisp")
:CASE :COMMON) => "TeX"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/TeX.lisp")
:CASE :LOCAL) => "TeX"
(NAMESTRING (MAKE-PATHNAME :HOST "MY-UNIX" :NAME "FOO"
:CASE :COMMON)) => "MY-UNIX:foo"
Rationale:
This does not solve the whole pathname problem, but it does improve
the situation for a clearly defined set of very common problems.
Together with the other pathname proposals, the behavior of pathnames
should be sufficiently consistent across Common Lisp implementations
and across file systems to allow portability of pathname-manipulating
programs.
The current situation where different implementations talk about
the *same* file system in different ways will be corrected by this
and some of the other pathname proposals.
Upper case is chosen as the common case for no better reason than
consistency with Lisp symbols.
The :CASE keyword argument provides access to both common and local
conventions without introducing any new functions. The default
convention is the common one, assuming that most programs are fully
portable and therefore :COMMON will be more frequently used.
Current Practice:
There are no known implementations of exactly what is proposed.
Symbolics Genera uses common case normally, and provides a way to
access the local case (called "raw") that in practice is rarely used.
Symbolics Genera's own file system is case-insensitive and uses lower
case as the customary case, but transparent network access is available
to file systems using all known case conventions.
Several Common Lisp implementations behave as if :CASE :LOCAL was
specified (but accept no :CASE argument).
Cost to Implementors:
The :CASE feature is easily added, but some implementations may have
to change the default behavior when :CASE is not specified. No
implementation need change its internal representation, nor the way
pathnames print, just the interface functions listed above.
Cost to Users:
Technically, this change is upward compatible.
In fact, since the existing CLtL spec is so poor, nearly everyone relies
heavily on implementation-specific behavior since there is little other
choice. As such, any change is almost certain to break lots of programs,
in usually superficial but nevertheless important ways. However, if we
really make the pathname facility more portable, the user community may
be willing to bear the consequences of these changes.
Cost of Non-Adoption:
We would be contributing to the perpetuation of the existing fiasco of a
pathname system.
Performance Impact:
None.
Benefits:
One step closer to a usable pathname system.
Aesthetics:
Anything that simplifies the user model of pathnames is an improvement.
Discussion:
Some people would rather use lowercase as the common case. The
decision is essentially arbitrary. Everywhere else in Common Lisp
where case matters, uppercase was chosen.
It has been proposed that the Common Lisp specification should include
specifications of the exact behavior of pathnames for several popular
operating systems, so that multiple implementations for those
operating systems would be compatible with each other. This proposal
does that for alphabetic case.
Some people want the default for :CASE to be :LOCAL instead of :COMMON.
See Rationale.
There should probably be a remark somewhere that says that portable
programs shouldn't expect to be able to create and/or access distinct
files whose pathname components differ only in case.
∂16-Jun-89 2153 X3J13-mailer Issue: PATHNAME-COMPONENT-VALUE (version 3)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 16 Jun 89 21:52:55 PDT
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 612480; 17 Jun 89 00:54:54 EDT
Date: Sat, 17 Jun 89 00:55 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: PATHNAME-COMPONENT-VALUE (version 3)
To: X3J13@sail.stanford.edu
Message-ID: <19890617045523.7.MOON@EUPHRATES.SCRC.Symbolics.COM>
Issue: PATHNAME-COMPONENT-VALUE
Related Issues: PATHNAME-CANONICAL-TYPE,
PATHNAME-SUBDIRECTORY-LIST,
PATHNAME-UNSPECIFIC-COMPONENT,
PATHNAME-WILD
References: CLtL pp.410-3
Category: CLARIFICATION and CHANGE
Edit history: Version 1, 20-Mar-89, by Moon
Version 2, 9-May-89, by Moon (rewrite based on mail)
Version 3, 17-Jun-89, by Moon (add discussion, current practice)
Problem description:
CLtL is overly restrictive on the possible values for pathname components.
These restrictions are described in a funny way that makes it unclear
whether they are requirements, guidelines, or just an example.
The restrictions are not all written down in one place, but they appear
to be as follows:
Host       nil, :wild, string, or list of strings
Device     nil, :wild, string, or something else ("structured")
Directory  nil, :wild, string, or something else ("structured")
Name       nil, :wild, string, or something else ("structured")
Type       nil, :wild, or string
Version    nil, :wild, :newest, positive integer, implementation-dependent
           symbol, or implementation-dependent integer less than or
           equal to zero. Suggestions include :oldest, :previous,
           :installed, 0, and -1.
PATHNAME-UNSPECIFIC-COMPONENT:NEW-TOKEN allowed implementations to
allow any component to be :UNSPECIFIC. This has been voted in.
PATHNAME-SUBDIRECTORY-LIST proposes a list of strings and keyword
symbols for the directory component.
PATHNAME-CANONICAL-TYPE proposes some new operations but does not
change the possible values of the type component.
PATHNAME-WILD proposes a portable way to test for
implementation-dependent component values that indicate wildcard
matching. It does not change the possible values of any component.
Proposal (PATHNAME-COMPONENT-VALUE:SPECIFY):
The points of the proposal have been numbered/lettered to facilitate
discussion of individual points.
0. Pathname component value strings never contain the punctuation
characters that are used to separate pathname fields (e.g. slashes and
dots in Unix). Punctuation characters appear only in namestrings.
Characters used as punctuation can appear in pathname component values
with a non-punctuation meaning if the file system allows it (e.g. a Unix
file name that begins with a dot).
When examining pathname components, conforming programs must be prepared
to encounter any of the following values:
1. Any component can be NIL, which means the component has not
been specified.
2. Any component can be :UNSPECIFIC, which means the component has
no meaning in this particular pathname.
3. The device, directory, name, and type can be strings.
4. The host can be any object, at the discretion of the implementation.
5. The directory can be a list of strings and symbols as detailed in
PATHNAME-SUBDIRECTORY-LIST (this assumes that it passes.)
6. The version can be any symbol or any integer. The symbol :NEWEST
refers to the largest version number that already exists in the file
system when reading, overwriting, appending, superseding, or directory
listing an existing file, and refers to the smallest version number
greater than any existing version number when creating a new file.
Other symbols and integers have implementation-defined meaning.
It is suggested, but not required, that implementations use positive
integers starting at 1 as version numbers, recognize the symbol :OLDEST
to designate the smallest existing version number, and use keyword
symbols for other special versions.
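The two meanings of :NEWEST in point 6 can be illustrated with a small
sketch. RESOLVE-NEWEST is a hypothetical name, and the sketch assumes the
suggested (not required) convention of positive integer version numbers
starting at 1.

```lisp
(defun resolve-newest (existing-versions &key creating)
  "Resolve :NEWEST against the list EXISTING-VERSIONS of integers.
When CREATING, return the smallest version greater than any existing
one; otherwise return the largest existing version, or NIL if none."
  (let ((largest (and existing-versions
                      (reduce #'max existing-versions))))
    (if creating
        (if largest (1+ largest) 1)
        largest)))
```

For example, with versions 1, 2 and 5 on disk, reading resolves :NEWEST
to 5 and creating a new file resolves it to 6.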
Wildcard pathnames can be used with DIRECTORY but not with OPEN, and
return true from WILD-PATHNAME-P (if issue PATHNAME-WILD passes). When
examining wildcard components of a wildcard pathname, conforming programs
must be prepared to encounter any of the following additional values
in any component or any element of a list that is the directory component:
7. :WILD, which matches anything.
8. A string containing implementation-dependent special wildcard
characters.
9. Any object, representing an implementation-dependent wildcard
pattern.
When constructing a pathname from components, conforming programs
must follow these rules:
a. Any component can be NIL. NIL in the host may mean a default host
rather than an actual NIL in some implementations.
b. The host, device, directory, name, and type can be strings. There
are implementation-dependent limits on the number and type of
characters in these strings.
c. The directory can be a list of strings and symbols as detailed in
PATHNAME-SUBDIRECTORY-LIST (this assumes that it passes.) There are
implementation-dependent limits on the list's length and contents.
d. The version can be :NEWEST.
e. Any component can be taken from the corresponding component
of another pathname. When the two pathnames are for different
file systems (in implementations that support multiple file
systems), an appropriate translation occurs. If no meaningful
translation is possible, an error is signalled. The definitions
of "appropriate" and "meaningful" are implementation-dependent.
f. When constructing a wildcard pathname, the name, type, or version
can be :WILD, which matches anything.
g. An implementation might support other values for some components,
but a portable program cannot use those values. A conforming program
can use implementation-dependent values but this can make it
non-portable; for example, it might work only with Unix file systems.
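Rules a through f can be read as a predicate on a component name and a
proposed value, sketched below. PORTABLE-COMPONENT-VALUE-P is a
hypothetical name. Rule e (values taken from another pathname) and the
implementation-dependent limits on strings and lists are not modeled, and
the directory list check is looser than PATHNAME-SUBDIRECTORY-LIST.

```lisp
(defun portable-component-value-p (field value)
  "True if a portable program may pass VALUE for FIELD when constructing
a pathname.  FIELD is one of :HOST :DEVICE :DIRECTORY :NAME :TYPE :VERSION."
  (or (null value)                                      ; rule a
      (and (member field '(:host :device :directory :name :type))
           (stringp value))                             ; rule b
      (and (eq field :directory)                        ; rule c
           (consp value)
           (every (lambda (x) (or (stringp x) (symbolp x))) value))
      (and (eq field :version) (eq value :newest))      ; rule d
      (and (member field '(:name :type :version))       ; rule f
           (eq value :wild))))
```

Note that an explicit version integer is rejected by this sketch: under
the rules above, :NEWEST is the only non-NIL, non-wild version a portable
program can construct directly.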
Consequences:
The changes relative to CLtL plus PATHNAME-UNSPECIFIC-COMPONENT
are as follows:
The removal of punctuation characters during parsing is specified.
"Structured" components are disallowed in non-wildcard pathnames,
except for the specific structuring of directories specified
in issue PATHNAME-SUBDIRECTORY-LIST.
"Structured" hosts are allowed, a generalization of CLtL's list
of strings.
The type and version can be "structured" in wildcard pathnames.
The difference between what component values a program can depend
on being able to use, versus what component values a program must
be prepared to encounter, is clarified.
The implementation-dependent variations are identified explicitly.
Rationale:
This should make it easier to write portable programs that deal with
pathnames and make it easier for implementors by clarifying the
framework into which they must fit. Also it should make it easier
to write the Common Lisp language specification by resolving some
things that were unclear about the status quo.
Adding "structured" hosts conforms to current practice.
Substituting a default host for NIL conforms to current practice
in implementations that require all pathnames to have a specific host.
Confining "structured" devices and names to wildcard pathnames, and
replacing "structured" directories with an explicit specification of
the form of the directory value, should improve portability without causing
any harm.
:WILD is only required to be supported in the name, type, or version,
which are the easiest to implement and the most useful in applications.
Current practice:
All versions of Symbolics Genera violate CLtL in the matter of hosts,
since Genera uses standard-objects as the host component. Genera deviates
slightly from PATHNAME-SUBDIRECTORY-LIST, but otherwise conforms to
PATHNAME-COMPONENT-VALUE:SPECIFY.
Like Genera, the Explorer current practice is to use an object instead of
a string for the host component. The directory component is a list of
strings, not yet supporting the symbols specified in proposal
PATHNAME-SUBDIRECTORY-LIST; other than that, the Explorer conforms to
proposal PATHNAME-COMPONENT-VALUE:SPECIFY.
Macintosh Allegro Common Lisp 1.2.2 uses NIL and "" for :UNSPECIFIC,
and uses a string with punctuation characters instead of a list for
the directory. MAKE-PATHNAME won't set a component to NIL when
:DEFAULTS is used, it merges with the defaults instead.
Otherwise it seems consistent with what is proposed.
Lucid Common Lisp 3.0.1 under Unix uses NIL for :UNSPECIFIC, and uses
a list for directories of somewhat different form from what is proposed
in PATHNAME-SUBDIRECTORY-LIST. Lucid lets you store arbitrary information
in the version field with MAKE-PATHNAME :VERSION and will return it with
PATHNAME-VERSION (as long as it's a symbol or an integer), even though
it's not used. Otherwise it seems consistent with what is proposed.
Ibuki Common Lisp Release 01/01 behaves the same as Lucid, including the
same form of structured directory, except it doesn't have the ability to
store information in the unused pathname version field, and it has the
same bug in MAKE-PATHNAME that the Macintosh has. Otherwise it seems
consistent with what is proposed.
Other implementations were not surveyed.
This proposal assumes that no current or planned implementation
uses "structured" names except possibly for wildcards.
Cost to Implementors:
Most implementations already conform, except for the changes required
by PATHNAME-UNSPECIFIC-COMPONENT and PATHNAME-SUBDIRECTORY-LIST, so
the cost of this proposal itself should be minimal. It is conceivable
that an implementation may exist that has to change its pathname
representation, for example one that uses numbers as "structured" devices.
Some implementations may have to change their treatment of punctuation
characters.
Cost to Users:
None.
Cost of non-adoption:
Pathnames will continue to be a poorly specified part of the language.
Performance impact:
None of any significance.
Benefits/Esthetics:
The boundary between the specified behavior of pathnames and the
implementation-dependent behavior of pathnames will be more clear.
Discussion:
Sandra Loosemore comments:
As I've said before, I don't think that trying to construct or pick
apart pathnames by component can be accomplished portably in any case,
because even if you restrict the representation of what can appear in
the various components, the objects you stuff in may or may not make
sense for a particular file system. Instead, I would much prefer to
deprecate MAKE-PATHNAME and the PATHNAME-xxx accessors and leave the
question of representation of components unspecified in the standard.
I realize that this position may be seen as being too extreme. In
that case I'd be willing to shut up and go along with proposal SPECIFY
as long as my position gets noted in the writeup.
Larry Masinter and Dave Moon both feel that we should be able to
prescribe exact pathname component values for popular file systems, so
that multiple implementations will behave the same way when using the
same file system. Obvious candidates as the key file systems are MS/DOS,
Macintosh, Unix, and VAX/VMS. A call for volunteers to write up tables
for any of them produced absolutely no response, however.
∂23-Mar-89 1503 X3J13-mailer **DRAFT** Issue: PATHNAME-EXTENSIONS (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 23 Mar 89 11:47:13 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 563798; Thu 23-Mar-89 14:47:01 EST
Date: Thu, 23 Mar 89 14:46 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: **DRAFT** Issue: PATHNAME-EXTENSIONS (Version 1)
To: X3J13@SAIL.Stanford.EDU
Message-ID: <890323144641.9.KMP@BOBOLINK.SCRC.Symbolics.COM>
>>> PLEASE DO -NOT- REPLY TO THIS MESSAGE <<<
It's probably so late that no one will read what you have to say
anyway. Ponder it and bring your comments to the meeting.
Summary of debate on CL-Cleanup follows at end.
-kmp
-----
Issue: PATHNAME-EXTENSIONS
Forum: Cleanup
References: Pathnames (pp410-413)
Category: ADDITION
Edit history: 28-Dec-88, Version 1 by Pitman
Status: For Internal Discussion
Problem Description:
CLtL is quite strict about what may and may not be in any kind of
pathname, leaving implementors up against a brick wall when an
idiosyncratic extension is necessary to uniquely and usefully
represent all files in a particular file system which may not have
been completely anticipated by the Common Lisp designers.
Proposal (PATHNAME-EXTENSIONS:NEW-PREDICATE):
Introduce a function COMMON-PATHNAME-P, described as follows:
COMMON-PATHNAME-P pathname [Function]
Returns true if its argument satisfies the Common Lisp
pathname model, and false otherwise. If the argument is
not a pathname, an error of type TYPE-ERROR is signalled.
Clarify that COMMON-PATHNAME-P considers a pathname's host field
to fit the Common Lisp pathname model if the filler of the host
field is a string (naming a host), and not otherwise.
Clarify that COMMON-PATHNAME-P considers a pathname's device to fit
the Common Lisp pathname model if it is a string naming a device,
or NIL, or :WILD[, or, if issue PATHNAME-UNSPECIFIC-COMPONENT
passes, is :UNSPECIFIC], and not otherwise.
Clarify that COMMON-PATHNAME-P considers a pathname's directory
field to fit the Common Lisp pathname model if the filler of the
directory field is NIL, or :WILD, or a string[, or, if issue
PATHNAME-SUBDIRECTORY-LIST passes, is a list described as valid
by that proposal][, or, if issue PATHNAME-UNSPECIFIC-COMPONENT
passes, is :UNSPECIFIC], and not otherwise.
Clarify that COMMON-PATHNAME-P considers a pathname's name to
fit the Common Lisp pathname model if it is a string, or NIL,
or :WILD, and not otherwise.
Clarify that COMMON-PATHNAME-P considers a pathname's type to
fit the Common Lisp pathname model if it is a string, or :WILD,
or NIL[, or, if issue PATHNAME-UNSPECIFIC-COMPONENT passes, is
:UNSPECIFIC], and not otherwise.
Clarify that COMMON-PATHNAME-P considers a pathname's version to
fit the Common Lisp pathname model if it is a positive integer,
:WILD, or NIL, or :NEWEST[, or, if issue PATHNAME-UNSPECIFIC-COMPONENT
passes, is :UNSPECIFIC], and not otherwise.
Clarify that COMMON-PATHNAME-P considers a pathname to be outside
the Common Lisp model if it contains special syntax or purpose
which is not readily apparent to Common Lisp programs. For example,
if a character like "*" or "~" has special meaning to the file
system, then strings like "F*X" or "~FOO" which exploit that syntax
are not considered to "fit the model". [Note that if issue
PATHNAME-WILD passes, WILD-PATHNAME-P might still be true of
some pathnames that were not COMMON-PATHNAME-P.]
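As a rough illustration of the per-component rules above, here is a
minimal sketch that applies them to raw component values rather than to
pathname objects (how an implementation represents hosts is outside its
scope). The names *SPECIAL-CHARACTERS*, PLAIN-STRING-P, and
COMMON-COMPONENTS-P are invented for this sketch and are not part of the
proposal. The character set is only an assumption standing in for
whatever the host file system treats as magic, and the bracketed
:UNSPECIFIC and subdirectory-list cases are left out.

```lisp
;; Assumption: the characters the local file system treats specially.
;; A real implementation would supply its own set.
(defparameter *special-characters* "*~")

(defun plain-string-p (object)
  "True if OBJECT is a string containing none of *SPECIAL-CHARACTERS*."
  (and (stringp object)
       (notany (lambda (ch) (find ch *special-characters*)) object)))

(defun common-components-p (&key host device directory name type version)
  "Apply the proposal's per-component rules to raw component values."
  (flet ((fits (value)
           ;; NIL, :WILD, or a string with no special syntax.
           (or (null value) (eq value :wild) (plain-string-p value))))
    (and (plain-string-p host)          ; the host must be a string
         (fits device)
         (fits directory)
         (fits name)
         (fits type)
         ;; Version: NIL, :WILD, :NEWEST, or a positive integer.
         (or (null version)
             (and (member version '(:wild :newest)) t)
             (and (integerp version) (plusp version))))))
```

For example, (common-components-p :host "H" :name "FOO" :version 17)
is true, while a name of "f*x" or a version of -1 makes it false.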
Test Case:
;; On Unix...
(common-pathname-p (make-pathname :name "f*x"))
=> nil
;; On Tops-20...
(common-pathname-p (make-pathname :name "FOO" :version -1))
=> nil
;; On VMS...
(common-pathname-p (parse-namestring "x::y::z::w::[joe]a.b"))
=> nil
;; Normally
(common-pathname-p (make-pathname :name "FOO" :version :wild))
=> t
(common-pathname-p (make-pathname :name "FOO" :version 17))
=> t
Rationale:
The purpose of COMMON-PATHNAME-P is not to detect pathnames which
are not valid. Indeed, no Common Lisp function requires that its
argument satisfy this test; it is assumed that functions such as
OPEN and MERGE-PATHNAMES will recognize and deal appropriately with
whatever special pathname syntax is appropriate to the host operating
system. Rather, the purpose of COMMON-PATHNAME-P is to help Common
Lisp programs which try to pick apart a pathname and perform some
sort of simulated merging on the basis of the simple pathname model
put forth by Common Lisp, so that such programs can detect situations
which are beyond their capabilities.
Current Practice:
Probably nobody implements this.
Cost to Implementors:
Small. The program is fairly straightforward. It could almost be
written as a portable library if it weren't for detecting special
characters that have some special syntax.
Cost to Users:
None. This change is upward compatible.
Cost of Non-Adoption:
Some idiosyncratic system syntax would be hard to detect.
Making extensions to the pathname system in a way that Common Lisp
users would not be forced to trip over would be more difficult.
Benefits:
Some ad-hoc user code which tries to do the same thing could be
eliminated. Portable programs which must prompt for native pathname
syntax, and deal with the result of having parsed it, could be more
robust.
Making idiosyncratic extensions to the pathname system would be much
less prone to cause problems for portable programs which used this
facility.
The presence of this operator could someday ease the transition
into a future, incompatible pathname system.
Aesthetics:
Probably improves aesthetics slightly by giving people who want to
reject extended pathnames a more reliable way of weeding them out.
Discussion:
The COMMON data type was probably intended to have this same purpose.
Unfortunately, since no one ever really said specifically enough what
was in COMMON or not, and why, it never really caught on. Hopefully
this proposal is definite enough on such issues to not be useless.
Pitman thinks this is probably a good idea.
------- Summary of debate -------
Discussion on CL-Cleanup centered around two issues:
- Is this really needed? What could it be used for?
I suggested the following program as an illustration:
(DEFUN TRANSLATE-LOGICAL-PATHNAME (LPATHNAME)
(MULTIPLE-VALUE-BIND (LHOST LDEVICE LDIR LNAME LTYPE LVERSION)
(PARSE-LOGICAL-PATHNAME LPATHNAME)
(MULTIPLE-VALUE-BIND (PHOST PDEVICE PDIR PNAME PTYPE PVERSION)
(TRANSLATE-PATHNAME-COMPONENTS LHOST LDEVICE LDIR LNAME LTYPE LVERSION)
(LET ((PHYSICAL-PATHNAME (MAKE-PATHNAME :HOST PHOST ...)))
(UNLESS (COMMON-PATHNAME-P PHYSICAL-PATHNAME)
(CERROR "Use ~*~A anyway."
"The result of translating pathname ~A to a physical pathname~
~%resulted in a valid physical pathname, ~A,~
~%but that pathname has special meaning to host ~A which may~
~%not have been what was intended."
LPATHNAME PHYSICAL-PATHNAME PHOST))))))
Also, recently there has been concern (e.g., in issue
PATHNAME-SUBDIRECTORY-LIST) about requirements for conformance
precluding interesting extensions that particular implementations
might want to experiment with. This would provide a way for portable
programs to guard against such `creative' extensions.
- Isn't this something users could write?
The answer is no. What is a non-portable pathname cannot be portably
detected. e.g., the fact that "*" or "~" or "{" or whatever is magic
in some filename syntax and not in another is (almost by definition) not
something that is portably detectable. Portable programs can just decide
to limit themselves to the least common denominator (e.g., refusing to let
you type in any pathname to a prompt for pathname if it has a `scary
looking' character in it), but this provides a way of being both a little
more robust and a little more tolerant.
For those who are curious, I'm not adamant about this proposal. I just
want it to be available as an option in case it eases the discussion on
other issues.
-kmp
∂21-Jun-89 1507 X3J13-mailer Issue: PATHNAME-LOGICAL (version 3)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 21 Jun 89 15:06:31 PDT
Received: from KENNETH-WILLIAMS.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via INTERNET with SMTP id 614486; 21 Jun 89 12:20:55 EDT
Date: Wed, 21 Jun 89 12:19 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: PATHNAME-LOGICAL (version 3)
To: X3J13@sail.stanford.edu
Message-ID: <19890621161905.6.MOON@KENNETH-WILLIAMS.SCRC.Symbolics.COM>
This is an old issue but it has not been written up before. I'm sorry this
is coming out so late, but it's been quite a struggle to cast it into clear
and easily understood form. It is rather a long proposal in order to be
very clear about exactly what it's proposing, and I apologize for the
length.
Issue: PATHNAME-LOGICAL
Forum: Cleanup
References: Pathnames (pp.410-413)
OPEN (p.418), WITH-OPEN-FILE (p.422), RENAME-FILE (p.423),
DELETE-FILE (p.424), PROBE-FILE (p.424),
FILE-WRITE-DATE (p.424), FILE-AUTHOR (p.424), LOAD (p.426),
COMPILE-FILE (p.439), DIRECTORY (p.427), PATHNAME (p.413),
TRUENAME (p.413), MERGE-PATHNAMES (p.415),
MAKE-PATHNAME (p.416), and PARSE-NAMESTRING (p.414).
Related issues: PATHNAME-CANONICAL-TYPE, PATHNAME-COMPONENT-VALUE,
PATHNAME-SUBDIRECTORY-LIST, and PATHNAME-WILD
Category: ADDITION
Edit history: Version 1, 11-May-89, by Moon
Version 2, 18-May-89, by Moon
Version 3, 21-Jun-89, by Moon (revise based on discussion
in the cleanup committee)
Problem description:
Pathname values are not portable, but they are sometimes part of a
program, for example the names of files containing the program and the
data used by the program. Moving large programs between sites would
be easier if pathname values did not have to be translated.
Pathname values are nonportable because not all Common Lisp
implementations use the same operating system and file name syntax varies
widely among operating systems. In addition, corresponding files at two
different sites may have different names even when the operating system
is the same; for example, they may be on different directories or
different devices.
The issue of portable pathname values is separate from the issues of
portable pathname operations. See the related issues listed above.
For inter-issue interactions, see the discussion section below.
Note that issue PATHNAME-LOGICAL fundamentally depends on issue
PATHNAME-WILD. If PATHNAME-WILD:NEW-FUNCTIONS does not pass,
PATHNAME-LOGICAL cannot pass.
Proposal (PATHNAME-LOGICAL:ADD):
1. Define a "logical" file system that looks the same at every site.
This file system is implemented by translating each logical pathname into
a physical pathname on a real file system. The logical pathnames are the
same at all sites, but the translations are different at each site, thus
the physical pathnames can be different at each site.
2a. The syntax of a logical pathname namestring is as follows:
[ host ":" ] [ ";" ] { directory ";" }* [ name ] [ "." type [ "." version ]]
2b. Terminology:
A <word> consists of one or more uppercase letters, digits, and hyphens.
A <wildcard word> consists of one or more asterisks, uppercase letters,
digits, and hyphens, including at least one asterisk, with no two
asterisks adjacent. Each asterisk matches a sequence of zero or more
characters. The <wildcard word> "*" parses into :WILD, the others parse
into strings.
In <words> and <wildcard words> lowercase letters are translated to
uppercase. The consequences of using other characters are unspecified.
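The terminology of point 2b can be sketched as a pair of predicates.
This is only an illustration: WORD-CHAR-P, WORD-P, WILDCARD-WORD-P, and
PARSE-WORD are names made up here, not names proposed for Common Lisp.
The proposal leaves the consequences of other characters unspecified;
signalling an error, as this sketch does, is just one possible choice.

```lisp
(defun word-char-p (ch)
  "True for letters A-Z (either case), digits, and hyphen."
  (find (char-upcase ch) "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"))

(defun word-p (string)
  "True if STRING is one or more word characters."
  (and (plusp (length string))
       (every #'word-char-p string)))

(defun wildcard-word-p (string)
  "True if STRING has at least one asterisk, no two adjacent, and
otherwise only word characters."
  (and (find #\* string)
       (not (search "**" string))
       (every (lambda (ch) (or (char= ch #\*) (word-char-p ch))) string)))

(defun parse-word (string)
  "Return :WILD for \"*\", otherwise the uppercased string."
  (cond ((string= string "*") :wild)
        ((or (word-p string) (wildcard-word-p string))
         (string-upcase string))
        (t (error "~S is neither a word nor a wildcard word." string))))
```

Note that "**" is rejected here; it is only meaningful as a directory
(:WILD-INFERIORS), which point 2c treats separately.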
2c. Logical pathname components:
The host is a <word> that has been defined as a logical pathname host by
using SETF of LOGICAL-PATHNAME-TRANSLATIONS.
There is no device, so the device component of a logical pathname is
always :UNSPECIFIC. No other component can be :UNSPECIFIC.
Each directory is a <word>, a <wildcard word>, or "**" (:WILD-INFERIORS).
If a semicolon precedes the directories, the directory component is
relative, otherwise it is absolute.
The name is a <word> or a <wildcard word>.
The type is a <word> or a <wildcard word>.
The version is a positive decimal integer or "NEWEST" (:NEWEST) or "*"
(:WILD). The letters in "NEWEST" can be in either alphabetic case.
The consequences of using any value not specified here as a logical
pathname component are unspecified.
The null string "" is not a valid value for any component of a logical
pathname, since "" is not a <word> and not a <wildcard word>.
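To make the namestring syntax of points 2a through 2c concrete, here is
a minimal sketch of a splitter that breaks a logical pathname namestring
into its components. SPLIT-AT and PARSE-LOGICAL-NAMESTRING are
hypothetical names used only for illustration. The sketch returns raw
component values, does not validate words, and represents the directory
as a list beginning with :ABSOLUTE or :RELATIVE, which is an assumption
borrowed from issue PATHNAME-SUBDIRECTORY-LIST.

```lisp
(defun split-at (char string)
  "Split STRING at each CHAR, returning a list of substrings."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun parse-logical-namestring (namestring)
  "Return host, directory list, name, type, and version as five values."
  (let* ((colon (position #\: namestring))
         (host (and colon (string-upcase (subseq namestring 0 colon))))
         (pieces (split-at #\; (subseq namestring (if colon (1+ colon) 0))))
         ;; A leading semicolon means the directory is relative.
         (relative (and (cdr pieces) (string= (first pieces) "")))
         (dirs (butlast (if relative (cdr pieces) pieces)))
         (file (split-at #\. (car (last pieces)))))
    (flet ((component (string)
             (cond ((or (null string) (string= string "")) nil)
                   ((string= string "*") :wild)
                   (t (string-upcase string))))
           (version (string)
             (cond ((or (null string) (string= string "")) nil)
                   ((string= string "*") :wild)
                   ((string-equal string "NEWEST") :newest)
                   (t (parse-integer string)))))
      (values host
              (cons (if relative :relative :absolute)
                    (mapcar (lambda (dir)
                              (if (string= dir "**")
                                  :wild-inferiors
                                  (component dir)))
                            dirs))
              (component (first file))
              (component (second file))
              (version (third file))))))
```

With this sketch, "foo:bar;baz;mum.quux.3" yields the host "FOO", the
directory (:ABSOLUTE "BAR" "BAZ"), the name "MUM", the type "QUUX", and
the version 3.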
3. Parsing of logical pathname namestrings into logical pathnames
operates as follows:
3a. Logical pathname namestrings are recognized by the LOGICAL-PATHNAME
and TRANSLATE-LOGICAL-PATHNAME functions. In this case the host portion
of the logical pathname namestring and its following colon are required.
3b. The PARSE-NAMESTRING function recognizes a logical pathname
namestring when the host argument is logical or the defaults argument is
a logical pathname. In this case the host portion of the logical
pathname namestring and its following colon are optional. If the host
portion of the namestring and the host argument are both present and do
not match, an error is signalled.
The host argument is logical if it is supplied and came from
PATHNAME-HOST of a logical pathname. Whether a host argument is logical
if it is a string equal to a logical pathname host name is
implementation-defined.
3c. The MERGE-PATHNAMES function recognizes a logical pathname namestring
when the defaults argument is a logical pathname. In this case the host
portion of the logical pathname namestring and its following colon are
optional.
3d. Whether the other functions that coerce strings to pathnames
(PATHNAME, TRUENAME, PARSE-NAMESTRING in other circumstances than those
described in point 3b, MERGE-PATHNAMES in other circumstances than those
described in point 3c, the :DEFAULTS argument to MAKE-PATHNAME,
PATHNAME-HOST, PATHNAME-DEVICE, PATHNAME-DIRECTORY, PATHNAME-NAME,
PATHNAME-TYPE, PATHNAME-VERSION, NAMESTRING, FILE-NAMESTRING,
DIRECTORY-NAMESTRING, HOST-NAMESTRING, ENOUGH-NAMESTRING, OPEN,
WITH-OPEN-FILE, RENAME-FILE, DELETE-FILE, PROBE-FILE, FILE-WRITE-DATE,
FILE-AUTHOR, LOAD, DIRECTORY, COMPILE-FILE, ED, DRIBBLE, WILD-PATHNAME-P,
PATHNAME-MATCH-P, TRANSLATE-PATHNAME, and COMPILE-FILE-PATHNAME)
recognize logical pathname namestrings is implementation defined.
4. Some real file systems do not have versions. Logical pathname
translation to such a file system ignores the version. This implies that
a program cannot rely on being able to store more than one version of a
file named by a logical pathname.
5. The type of a logical pathname for a Common Lisp source file is "LISP".
This should be translated into whatever type is appropriate in a physical
pathname.
6. The logical pathname host name "SYS" is reserved for the implementation.
The existence and meaning of SYS: logical pathnames is
implementation-defined.
7. File manipulation functions operate with logical pathnames as follows:
7a. The functions OPEN (and WITH-OPEN-FILE), RENAME-FILE, DELETE-FILE,
PROBE-FILE, FILE-WRITE-DATE, FILE-AUTHOR, LOAD, DIRECTORY, COMPILE-FILE,
ED, DRIBBLE, COMPILE-FILE-PATHNAME, and TRUENAME accept logical pathnames
and translate them into physical pathnames, as if by calling the
TRANSLATE-LOGICAL-PATHNAME function.
7b. PATHNAME of a stream created by OPEN (or WITH-OPEN-FILE) of a logical
pathname is a logical pathname.
7c. TRUENAME, PROBE-FILE, and DIRECTORY never return logical pathnames.
7d. RENAME-FILE with a logical pathname as the second argument returns a
logical pathname as the first value.
7e. MERGE-PATHNAMES returns a logical pathname if and only if its first
argument is a logical pathname or its first argument does not specify a
host and the default is a logical pathname.
7f. MAKE-PATHNAME returns a logical pathname if and only if the host is
logical. If the :host argument to MAKE-PATHNAME is supplied, the host is
logical if it came from PATHNAME-HOST of a logical pathname. Whether a
:host argument is logical if it is a string equal to a logical pathname
host name is implementation-defined.
7g. PARSE-NAMESTRING returns a logical pathname according to points 3b
and 3d.
Add these defined names to Common Lisp in support of logical pathnames:
8. LOGICAL-PATHNAME [Class]
LOGICAL-PATHNAME is a subclass of PATHNAME.
9. LOGICAL-PATHNAME pathname [Function]
Converts the argument to a logical pathname and returns it. The
argument can be a logical pathname, a logical pathname namestring
containing a host component, or a stream for which the PATHNAME
function returns a logical pathname. For any other argument,
LOGICAL-PATHNAME signals an error of type TYPE-ERROR.
10. TRANSLATE-LOGICAL-PATHNAME pathname [Function]
Translates a logical pathname to the corresponding physical pathname.
The pathname argument is first coerced to a pathname. If it is not a
pathname, string, or file stream, an error of type TYPE-ERROR is
signalled.
If the coerced argument is a logical pathname, the first matching
translation (according to PATHNAME-MATCH-P) of the logical pathname
host is applied, using TRANSLATE-PATHNAME with the reversible argument
true. If the result is a logical pathname, this process is repeated.
Three values are returned:
1. The physical pathname
2. The from-wildcard of the translation
3. The to-wildcard of the translation
If no translation matches, an error of type FILE-ERROR is signalled.
If the coerced argument is a physical pathname, it is returned as all
three values.
All three values are pathnames. The second and third values might not
come directly from a logical pathname translation list; they might be
modified to reflect multiple levels of translation and/or additional
translations, typically to provide translation of file types to local
naming conventions, to accommodate physical file systems with limited
length names, or to deal with special character requirements such as
translating hyphens to underscores or uppercase letters to lowercase.
Any such additional translations are implementation defined. Some
implementations do no additional translations.
Except when an error is signalled, TRANSLATE-LOGICAL-PATHNAME satisfies
this identity:
(MULTIPLE-VALUE-BIND (PHYSICAL FROM-WILDCARD TO-WILDCARD)
(TRANSLATE-LOGICAL-PATHNAME LOGICAL)
(AND (EQUAL (TRANSLATE-PATHNAME LOGICAL FROM-WILDCARD TO-WILDCARD T)
PHYSICAL)
(EQUAL (TRANSLATE-PATHNAME PHYSICAL TO-WILDCARD FROM-WILDCARD T)
LOGICAL)))
The above is written assuming the LOGICAL argument has already been
coerced to a pathname.
11. LOGICAL-PATHNAME-TRANSLATIONS host [Function]
If <host> is not the host component of a logical pathname and not a
string that has been defined as a logical pathname host name by SETF of
LOGICAL-PATHNAME-TRANSLATIONS, signals an error of type TYPE-ERROR.
Otherwise returns the host's list of translations. Each translation is
a list of at least two elements: from-wildcard and to-wildcard. Any
additional elements are implementation defined. From-wildcard is a
logical pathname whose host is <host>. To-wildcard is a pathname.
Translations are searched in the order listed, so more specific
from-wildcards must precede more general ones.
(SETF (LOGICAL-PATHNAME-TRANSLATIONS host) translations) sets a logical
pathname host's list of translations. If <host> is a string that has
not been previously used as a logical pathname host, a new logical
pathname host is defined, otherwise an existing host's translations are
replaced. Logical pathname host names are compared with STRING-EQUAL.
When setting the translations list, each from-wildcard can be a logical
pathname whose host is <host> or a logical pathname namestring
parseable by (PARSE-NAMESTRING string host). Each to-wildcard can be
anything coercible to a pathname by (PATHNAME to-wildcard). If
to-wildcard coerces to a logical pathname, TRANSLATE-LOGICAL-PATHNAME
will perform repeated translation steps when it uses it. There can
be implementation-defined restrictions against logical to-wildcards
that would produce non-reversible translations.
Implementations can define additional functions that operate on
logical pathname hosts.
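The remark in point 11 that translations are searched in the order
listed can be illustrated with a small string-based model. This is a
sketch under simplifying assumptions: it matches single wildcard words
rather than whole pathnames, and WILD-MATCH-P and FIRST-TRANSLATION are
invented names, not a substitute for PATHNAME-MATCH-P.

```lisp
(defun wild-match-p (pattern string &optional (p 0) (s 0))
  "True if STRING matches PATTERN, where each * in PATTERN matches zero
or more characters. Letters are compared without regard to case."
  (cond ((= p (length pattern)) (= s (length string)))
        ((char= (char pattern p) #\*)
         ;; Try every possible length for the text matched by this *.
         (loop for k from s to (length string)
               thereis (wild-match-p pattern string (1+ p) k)))
        (t (and (< s (length string))
                (char-equal (char pattern p) (char string s))
                (wild-match-p pattern string (1+ p) (1+ s))))))

(defun first-translation (name translations)
  "Return the first (from-wildcard to-wildcard) entry whose
from-wildcard matches NAME, or NIL if none does."
  (find-if (lambda (translation)
             (wild-match-p (first translation) name))
           translations))
```

Given the list (("DOCUMENTATION" "docum") ("*" "same")), the name
"documentation" selects the first entry. If the two entries are swapped,
the general "*" entry wins and the specific one is never reached, which
is why more specific from-wildcards must come first.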
12. LOAD-LOGICAL-PATHNAME-TRANSLATIONS host [Function]
If a logical pathname host named <host> (a string) is already defined,
return NIL. Otherwise, search for a logical pathname host definition
in an implementation defined manner. If none is found, signal an
error. If a definition is found, install it and return T.
The search used by LOAD-LOGICAL-PATHNAME-TRANSLATIONS should be
documented, as logical pathname definitions will be created by users,
not only by Lisp implementors. A typical search technique is to
look in a certain directory for a file whose name is derived from
the host name in an implementation-defined fashion.
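The control flow described in point 12 might look roughly like the
following. This is a sketch only: *HOST-TABLE* and
LOAD-TRANSLATIONS-SKETCH are hypothetical, and the FINDER argument
stands in for the implementation-defined search, which the proposal
deliberately does not specify.

```lisp
;; Hypothetical registry mapping host names to translation lists.
;; EQUALP keys make the lookup case-insensitive, in the spirit of the
;; STRING-EQUAL comparison of host names in point 11.
(defvar *host-table* (make-hash-table :test #'equalp))

(defun load-translations-sketch (host finder)
  "FINDER is an assumed function from a host name to a translation
list, or NIL if no definition can be found."
  (cond ((nth-value 1 (gethash host *host-table*))
         nil)                           ; already defined: do nothing
        (t
         (let ((definition (funcall finder host)))
           (unless definition
             (error "No logical pathname host definition found for ~S."
                    host))
           (setf (gethash host *host-table*) definition)
           t))))
```

A real FINDER would typically read a file from a documented directory;
here it can be any function, which keeps the sketch independent of a
particular file system.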
13. COMPILE-FILE-PATHNAME pathname &key :output-file [Function]
Returns the pathname that COMPILE-FILE would write into, if given the
same arguments. If the pathname argument is a logical pathname and the
:output-file argument is unspecified, the result is a logical pathname.
If an implementation supports additional keyword arguments to
COMPILE-FILE, COMPILE-FILE-PATHNAME must accept the same arguments.
Examples:
;This function is like DIRECTORY, but if its argument is a logical
;pathname it returns logical pathnames in the results. If its
;argument is a physical pathname, it is the same as DIRECTORY.
(defun logical-directory (pathname)
(multiple-value-bind (physical from-wildcard to-wildcard)
(translate-logical-pathname pathname)
(map 'list #'(lambda (truename)
(translate-pathname truename to-wildcard
from-wildcard t))
(directory physical))))
;A very simple example of setting up a logical pathname host. No
;translations are necessary to get around file system restrictions, so
;all that is necessary is to specify the root of the physical directory
;tree that contains the logical file system.
;The namestring syntax on the right-hand side is implementation-specific.
(setf (logical-pathname-translations "foo")
'(("**;*.*.*" "MY-LISPM:>library>foo>**>")))
;Sample use of that logical pathname. All return values
;are of course implementation-specific.
(translate-logical-pathname "foo:bar;baz;mum.quux.3")
=> MY-LISPM:>library>foo>bar>baz>mum.quux.3,
foo:**;*.*.*,
MY-LISPM:>library>foo>**>*.*.*
;A more complex example, dividing the files among two file servers
;and several different directories. This Unix doesn't support
;:WILD-INFERIORS in the directory, so each directory level must
;be translated individually. No file name or type translations
;are required except for .MAIL to .MBX.
;The namestring syntax on the right-hand side is implementation-specific.
(setf (logical-pathname-translations "prog")
'(("RELEASED;*.*.*" "MY-UNIX:/sys/bin/my-prog/")
("RELEASED;*;*.*.*" "MY-UNIX:/sys/bin/my-prog/*/")
("EXPERIMENTAL;*.*.*" "MY-UNIX:/usr/Joe/development/prog/")
("EXPERIMENTAL;DOCUMENTATION;*.*.*"
"MY-VAX:SYS$DISK:[JOE.DOC]")
("EXPERIMENTAL;*;*.*.*" "MY-UNIX:/usr/Joe/development/prog/*/")
("MAIL;**;*.MAIL" "MY-VAX:SYS$DISK:[JOE.MAIL.PROG...]*.MBX")))
;Sample use of that logical pathname. All return values
;are of course implementation-specific.
(translate-logical-pathname "prog:mail;save;ideas.mail.3")
=> MY-VAX:SYS$DISK:[JOE.MAIL.PROG.SAVE]IDEAS.MBX.3,
PROG:MAIL;**;*.MAIL.*,
MY-VAX:SYS$DISK:[JOE.MAIL.PROG...]*.MBX.*
;Example translations for a program that uses three files main.lisp,
;auxiliary.lisp, and documentation.lisp. These translations might be
;supplied by a software supplier as examples.
;For Unix with long file names
(setf (logical-pathname-translations "prog")
'(("CODE;*.*.*" "/lib/prog/")))
;Sample use of that logical pathname. All return values
;are of course implementation-specific.
(translate-logical-pathname "prog:code;documentation.lisp")
=> /lib/prog/documentation.lisp,
PROG:CODE;*.*.*,
/lib/prog/*
;For Unix with 14-character file names, using .lisp as the type
(setf (logical-pathname-translations "prog")
'(("CODE;DOCUMENTATION.*.*" "/lib/prog/docum.*")
("CODE;*.*.*" "/lib/prog/")))
;Sample use of that logical pathname. All return values
;are of course implementation-specific.
(translate-logical-pathname "prog:code;documentation.lisp")
=> /lib/prog/docum.lisp,
PROG:CODE;DOCUMENTATION.*.*,
/lib/prog/docum.*
;For Unix with 14-character file names, using .l as the type
;The second translation shortens the compiled file type to .b
(setf (logical-pathname-translations "prog")
`(("**;*.LISP.*" ,(logical-pathname "PROG:**;*.L.*"))
(,(compile-file-pathname (logical-pathname "PROG:**;*.LISP.*"))
,(logical-pathname "PROG:**;*.B.*"))
("CODE;DOCUMENTATION.*.*" "/lib/prog/documentatio.*")
("CODE;*.*.*" "/lib/prog/")))
;Sample use of that logical pathname. All return values
;are of course implementation-specific.
(translate-logical-pathname "prog:code;documentation.lisp")
=> /lib/prog/documentatio.l,
PROG:CODE;DOCUMENTATION.LISP.*,
/lib/prog/documentatio.l
;For a Cray with 6 character names and no directories, types, or versions.
(setf (logical-pathname-translations "prog")
(let ((l '(("MAIN" "PGMN")
("AUXILIARY" "PGAUX")
("DOCUMENTATION" "PGDOC")))
(logpath (logical-pathname "prog:code;"))
(phypath (pathname "XXX")))
(append
;; Translations for source files
(mapcar #'(lambda (x)
(let ((log (first x))
(phy (second x)))
(list (make-pathname :name log
:type "LISP"
:version :wild
:defaults logpath)
(make-pathname :name phy
:defaults phypath))))
l)
;; Translations for compiled files
(mapcar #'(lambda (x)
(let* ((log (first x))
(phy (second x))
(com (compile-file-pathname
(make-pathname :name log
:type "LISP"
:version :wild
:defaults logpath))))
(setq phy (concatenate 'string phy "B"))
(list com
(make-pathname :name phy
:defaults phypath))))
l))))
;Sample use of that logical pathname. All return values
;are of course implementation-specific.
(translate-logical-pathname "prog:code;documentation.lisp")
=> PGDOC,
PROG:CODE;DOCUMENTATION.LISP.*,
PGDOC
Rationale:
1. Large programs can be moved between sites without changing any
pathnames, provided all pathnames used are logical. A portable system
construction tool can be created that operates on programs defined as
sets of files named by logical pathnames.
2. Logical pathname syntax was chosen to be easily translated into most
popular file systems, while still being powerful enough for storing large
programs. Although they have hierarchical directories, extended wildcard
matching, versions, and no limit on the length of names, they can be
mapped onto a less capable real file system by translating each
directory that is used into a flat directory name, doing wildcards in
Lisp rather than in the file system, treating all versions as :newest,
and/or using translations to shorten long names.
Logical pathname words are restricted to non-case-sensitive letters,
digits, and hyphens to avoid creating problems with real file systems
that support limited character sets for file naming. Other characters
could have been mapped onto such file systems through translations, but
that didn't seem worth the trouble. Logical pathnames have to be
non-case-sensitive or it would be very difficult to map them onto a
non-case-sensitive file system.
Features such as :UP and :BACK relative directories and a namestring
syntax for the root directory were not felt to be necessary in logical
pathnames. They could be added later if a need emerges.
It is not a goal of logical pathnames to be able to represent all
possible file names. Their goal is rather to represent just enough file
names to be useful for storing software. Real pathnames, in contrast,
need to provide a uniform interface to all possible file names, including
names and naming conventions that are not under the control of Common
Lisp.
The choice of logical pathname syntax, using colon, semicolon, and
period, was guided by the goals of being visually distinct from real file
systems and minimizing the use of special characters.
The consequences of using any value not specified here as a logical
pathname component are unspecified, for the benefit of the Explorer.
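The idea in point 2 of using translations to shorten long names can be
sketched as follows. SHORTEN-FILE-NAME and COLLISION-FREE-P are
hypothetical helpers, and plain truncation is only one possible
shortening rule, chosen here because it is easy to follow. The default
limit of 14 characters is an assumption taken from the Unix examples
above.

```lisp
(defun shorten-file-name (name type &optional (limit 14))
  "Truncate NAME so that NAME.TYPE fits within LIMIT characters."
  (let ((room (max 0 (- limit (length type) 1))))
    (format nil "~A.~A" (subseq name 0 (min (length name) room)) type)))

(defun collision-free-p (names type &optional (limit 14))
  "True if shortening every name in NAMES gives distinct results."
  (let ((short (mapcar (lambda (name) (shorten-file-name name type limit))
                       names)))
    (= (length short)
       (length (remove-duplicates short :test #'string=)))))
```

Truncation is not reversible in general, so a check like
COLLISION-FREE-P is worth running over the full list of names before
relying on such a rule; where it fails, explicit per-file translations
are the safer route.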
3. The LOGICAL-PATHNAME function is separate from the PATHNAME function
so that the syntax of logical pathname namestrings does not constrain the
syntax of physical pathname namestrings in any way. Logical pathname
syntax must be defined by Common Lisp so that logical pathnames can be
conveniently exchanged between implementations, but physical pathname
syntax is dictated by forces outside our control.
3b,c. Allowing PARSE-NAMESTRING and MERGE-PATHNAMES to recognize logical
pathname namestrings in these situations provides for natural operations
on logical pathnames. Frequently a string containing just a name, or a
name and a type, will be recognized as a logical pathname by merging it
against a default containing a logical pathname host and directory.
3d. Recognition of logical pathname namestrings by PATHNAME and related
functions is left up to each implementation because some implementations
definitely require this, other implementations don't want to do this, and
nobody wants to change. In any case, Common Lisp historically has avoided
saying anything about the syntax of the strings accepted by the PATHNAME
function, and point 3d preserves that position.
3b,7f. Leaving it implementation defined whether a string, used as the
host argument to PARSE-NAMESTRING or the :host argument to MAKE-PATHNAME,
can be recognized as a logical pathname host name is for the same reason as
point 3d. It allows each implementation to decide whether there is one
namespace or two. The correct way to write this is:
(MAKE-PATHNAME :HOST (PATHNAME-HOST (LOGICAL-PATHNAME "hostname:"))
...)
4. Logical pathname versions could have been supported on real file
systems that do not have versions by defining a kind of translation to
encode the version number in the name. However, the typical use of
versions is such that on a file system without versions, people would
rather just store one version of a file, and not preserve the version
information by encoding it somehow in the name. This is different from
the typical use of types or directories, where the files with different
values in those components are truly distinct and everything would break
if you only kept one file.
5,13. The COMPILE-FILE-PATHNAME function and the specification of "LISP"
as the type of a logical pathname for a Common Lisp source file together
provide enough information about compilation for a portable system
construction tool that uses logical pathnames to work. Suppose you want
to call COMPILE-FILE only if the source file is newer than the compiled
file. To do that, you have to have a way to know the name of the
compiled file without actually calling COMPILE-FILE.
No standard file type for compiler output is proposed, because in some
implementations the compiler produces one of several file types,
depending on a variety of implementation-dependent circumstances.
COMPILE-FILE-PATHNAME provides access to the "default[ing] in a manner
appropriate to the implementation's file system conventions" mentioned in
the CLtL documentation of COMPILE-FILE.
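The use case described in point 5,13, compiling only when the source is
newer than the compiled file, could be written along these lines. This
is a sketch: STALE-P and COMPILE-IF-STALE are illustrative names, and it
assumes COMPILE-FILE-PATHNAME behaves as described in point 13.

```lisp
(defun stale-p (source-date output-date)
  "True if the output is missing (a NIL date) or older than the source."
  (or (null output-date)
      (and source-date (> source-date output-date))))

(defun compile-if-stale (source)
  "Compile SOURCE only if its compiled file is missing or out of date."
  (let* ((output (compile-file-pathname source))
         (output-date (and (probe-file output)
                           (file-write-date output))))
    (if (stale-p (file-write-date source) output-date)
        (compile-file source)
        output)))
```

The date comparison is kept in its own function so that it can be
checked without touching the file system.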
6. The use of the logical pathname host name "SYS" for the implementation
is current practice. Standardizing on this name helps users choose
logical pathname host names that avoid conflicting with
implementation-defined names.
7. Accepting logical pathnames for file access is a natural extension
of the file access functions and makes it easier to program using only
logical pathnames in situations where that is appropriate.
8. The LOGICAL-PATHNAME class exists so that methods can distinguish
logical pathnames from regular pathnames.
9. See point 3 above.
10. TRANSLATE-LOGICAL-PATHNAME is the heart of the logical pathname
feature. The two extra values returned by TRANSLATE-LOGICAL-PATHNAME
allow for back-translation, as shown in the LOGICAL-DIRECTORY example.
Allowing TRANSLATE-LOGICAL-PATHNAME on a physical pathname, simply
returning the argument, makes some programs easier to write. Additional
implementation defined translations make it possible for implementations
with unusual file systems to offer some help to the user in setting up
the translations for a logical pathname host, by handling some of the
work automatically. Logical pathnames that translate to other logical
pathnames are a feature that several people have requested.
11. SETF of LOGICAL-PATHNAME-TRANSLATIONS is a simple way for a user to
define a new logical pathname host. Using SETF makes it possible to add
to or modify the translations of an existing logical pathname host. The
restriction against non-reversible translations is necessary because many
logical-pathname-using programs depend on reversibility, for instance to
translate a truename back into a logical pathname. If logical pathname
translation were not reversible, two different logical pathnames might
translate into the same physical pathname, which could scramble files.
It is always up to the person who writes the translation rules for a
particular logical pathname host to a particular physical file system to
make sure that the logical pathnames that are actually going to be used
translate to valid pathnames for the particular file system.
12. Loading of logical pathname translations from a site-dependent file
allows software to be distributed using logical pathnames. The assumed
model of software distribution is a division of labor between the
supplier of the software and the user installing it. The supplier
chooses logical pathnames to name all the files used or created by the
software, and supplies examples of logical pathname translations for a
few popular file systems. Each example uses an assumed directory and/or
device name, assumes local file naming conventions, and provides
translations that will translate all the logical pathnames used or
generated by the particular software into valid physical pathnames.
For a powerful file system these translations can be quite simple. For
a more restricted file system, it may be necessary to list an explicit
translation for every logical pathname used, for example when dealing
with restrictions on the maximum length of a file name.
The user installing the software decides on which device and/or directory
to store the files and edits the example logical pathname translations
accordingly. If necessary, the user also adjusts the translations for
local file naming conventions and any other special aspects of the user's
local file system policy and local Common Lisp implementation. For
example, the files might be divided among several file server hosts to
share the load. The process of defining site-customized logical pathname
translations is quite easy for a user of a popular file system for which
the software supplier has provided an example. A user of a more unusual
file system might have to take more time; the supplier can help by
providing a list of all the logical pathnames used or generated by the
software.
Once the user has created a suitable SETF of LOGICAL-PATHNAME-TRANSLATIONS
form, he can evaluate that form and then load and run the software. It
may be necessary to use the translations again, or on another workstation
at the same site, so it is best to save the SETF form in the standard
place where it can be found later by LOAD-LOGICAL-PATHNAME-TRANSLATIONS.
Often a software supplier will include a program for restoring software
from the distribution medium to the file system, and a program for loading
the software from the file system into a Common Lisp, and these programs
will start by calling LOAD-LOGICAL-PATHNAME-TRANSLATIONS to make sure that
the logical pathname host is defined.
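As a rough sketch of the kind of form the installing user might write, consider the following. The logical host name "PROG", the directory /usr/local/prog/, and the helper PROG-PHYSICAL-PATHNAME are all hypothetical, and the translation list syntax is assumed to be the one later adopted by ANSI Common Lisp, which may differ in detail from this draft.

```lisp
;; Hypothetical site file written by the installing user, not the supplier.
;; "PROG" and /usr/local/prog/ are assumptions made for illustration only.
(setf (logical-pathname-translations "PROG")
      '(("**;*.*.*" "/usr/local/prog/**/*.*")))

;; The supplier's code mentions only logical pathnames such as
;; "PROG:SOURCE;MAIN.LISP"; the physical name is computed at the site.
(defun prog-physical-pathname (logical-namestring)
  "Translate a logical namestring on host PROG to a physical pathname."
  (translate-logical-pathname logical-namestring))
```

A single wildcard translation like this one presumes a file system with few naming restrictions. A more restrictive file system would need additional, more specific entries placed ahead of it in the list.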
Note that the SETF of LOGICAL-PATHNAME-TRANSLATIONS form isn't part of
the program; it is separate. It is written by the user, not by the
software supplier. That separation, and a uniform convention for how to
do the separation, are the key aspects of logical pathnames. For small
programs involving only a handful of files, it doesn't matter much. The
real benefits come with large programs with hundreds or thousands of
files and more complicated situations such as program-generated file
names or porting a program developed on a system with long file names
onto a system with a very restrictive limit on the length of file names.
Current practice:
Symbolics Genera has had a similar facility for many years. It is used
extensively for software distribution by Symbolics and its customers.
The Genera facility uses the same logical pathname syntax but different
function names, and is somewhat more complicated. The extra complexity
is not necessary in the Common Lisp standard.
The T.I. Explorer also has a comparable logical pathname facility,
although the translation mechanism is unfortunately less general than
proposed here. The namestring syntax used is slightly different:
host ":" [{directory "."}* directory ";"] [name] ["." type] ["#" version]
The newest version is indicated by ">" instead of "newest".
Macintosh Allegro Common Lisp has a logical pathname feature which is
somewhat simpler but aimed at solving the same problems. It has logical
directory names, to simplify access to sets of files in differently named
directories (an especially severe problem on micros where everybody just
has to have a different pet name for their hard disk). This isn't really
the same as simplifying access to different file systems, although of
course solving the latter automatically solves the former. In general,
access to different file systems requires translating names and types,
not just translating directories.
Symbolics Genera offers a function for translating from a physical
pathname back to a logical pathname. There are a number of problems with
this, and so it has not been proposed here. Instead
TRANSLATE-LOGICAL-PATHNAME returns enough information to allow the user
program to perform the backtranslation itself.
The Genera equivalent of LOAD-LOGICAL-PATHNAME-TRANSLATIONS looks for
a file named SYS:SITE;hostname.TRANSLATIONS.
Current practice in Genera, Explorer, and Macintosh has one namespace for
both logical and physical namestrings. This proposal allows an
implementation to choose to have one namespace or to have two separate
namespaces for namestrings.
Cost to Implementors:
This is a fairly complex facility, but its performance is unimportant
so a straightforward implementation should suffice. Most of the
complexity comes in dealing with unusual file systems, such as ones
that don't allow long file names.
Cost to Users:
None.
Cost of non-adoption:
Portable software construction and distribution will have to rely on
implementation-dependent kludges. Lisp software will continue to be
difficult to install.
Performance impact:
None.
Benefits:
Avoid cost of non-adoption.
Esthetics:
Improved portability of large programs.
Discussion:
Issue PATHNAME-LOGICAL fundamentally depends on issue PATHNAME-WILD. If
PATHNAME-WILD:NEW-FUNCTIONS does not pass, PATHNAME-LOGICAL cannot pass.
If PATHNAME-CANONICAL-TYPE:NEW-CONCEPT passes, it will affect the
behavior of the function TRANSLATE-PATHNAME and therefore the behavior of
the function TRANSLATE-LOGICAL-PATHNAME. When a logical pathname
translation has from-wildcard and to-wildcard type components that are
:WILD or omitted, translation of the type will be guided by canonical
types. If PATHNAME-CANONICAL-TYPE:NEW-CONCEPT fails to pass, it will
either have to be done behind the scenes by TRANSLATE-PATHNAME or users
will have to write more verbose translations that individually specify
the handling of each file type.
∂23-Mar-89 2059 X3J13-mailer **DRAFT** Issue: PATHNAME-PRINT-READ (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 23 Mar 89 20:59:42 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via INTERNET with SMTP id 563865; 23 Mar 89 15:36:42 EST
Date: Thu, 23 Mar 89 15:36 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: **DRAFT** Issue: PATHNAME-PRINT-READ (Version 1)
To: X3J13@SAIL.Stanford.EDU
Message-ID: <890323153624.3.KMP@BOBOLINK.SCRC.Symbolics.COM>
>>> PLEASE DO -NOT- REPLY TO THIS ISSUE <<<
Ponder it and come prepared to discuss it at the meeting.
See summary of Cleanup discussion at end.
-kmp
-----
Issue: PATHNAME-PRINT-READ
References: File System Interface (pp409-427)
Category: CHANGE/ADDITION
Edit history: 21-Oct-88, Version 1 by Pitman
Status: For Internal Discussion
Problem Description:
Although pathnames are required to print re-readably, there is no
standardized representation for pathnames and so no standardized
way in which they should print.
Further, it is common in programs to want pathnames to print in
their file-system specific format.
Proposal (PATHNAME-PRINT-READ:SHARPSIGN-P):
Define the reader syntax #P"..." to be equivalent to
#.(PARSE-NAMESTRING "...").
Define that when *PRINT-ESCAPE* is T, the syntax #P"..." is
how a pathname should be printed by WRITE (and hence by PRIN1,
PRINT, etc.). The "..." is the namestring representation of the
pathname.
Define that when *PRINT-ESCAPE* is NIL, WRITE writes a pathname
object P by writing (NAMESTRING P) instead.
Test Case:
(PARSE-NAMESTRING "foo.lisp")
=> #P"foo.lisp"
(FORMAT NIL "Written to ~A." #P"foo.bin")
=> "Written to foo.bin."
(TYPEP #P"foo.bin" 'PATHNAME)
=> T
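The two halves of the proposal might be sketched as follows. This is only an illustration: the names *SHARP-P-READTABLE*, SHARP-P-READER, READ-PATHNAME-LITERAL, and WRITE-PATHNAME are hypothetical, and a private copy of the readtable is used so that nothing global is changed.

```lisp
;; A copy of the standard readtable, so the global one is left untouched.
(defvar *sharp-p-readtable* (copy-readtable nil))

(defun sharp-p-reader (stream subchar arg)
  "Read the string that follows #P and parse it as a namestring."
  (declare (ignore subchar arg))
  (parse-namestring (read stream t nil t)))

(set-dispatch-macro-character #\# #\P #'sharp-p-reader *sharp-p-readtable*)

(defun read-pathname-literal (text)
  "Read TEXT, such as \"#P\\\"foo.lisp\\\"\", using the private readtable."
  (let ((*readtable* *sharp-p-readtable*))
    (values (read-from-string text))))

(defun write-pathname (path stream)
  "Print PATH as #P\"...\" when *PRINT-ESCAPE* is true, else as its namestring."
  (if *print-escape*
      (format stream "#P~S" (namestring path))
      (write-string (namestring path) stream)))
```

Because the reader function calls only READ and PARSE-NAMESTRING, it does not need the evaluator, unlike the #.(PARSE-NAMESTRING "...") form it is defined to be equivalent to.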
Rationale:
This satisfies the stated goals.
[For :ESCAPE T] It will not be possible to make the printed
representation of a pathname totally portable because of
variations in file systems, but for different Common Lisp
implementations on the same file system, or for Common Lisp
systems running on file systems having compatible syntax,
portability would be improved by this specification.
Also, some implementations (eg, Symbolics Genera) use
specialized representations for pathnames on different file
systems. Eg, an MSDOS pathname is of type MSDOS-PATHNAME,
not just type PATHNAME. #S(PATHNAME ...) is not only more
verbose than necessary but might be misleading to some users
because the object created will not have a TYPE-OF PATHNAME.
[For :ESCAPE NIL] Printing the namestring of a pathname is
a common operation and it is convenient to have a shorthand
for doing it. Further, some implementations may be able to
optimize the presentation of a pathname in this mode by
printing it without actually consing the string.
Current Practice:
Symbolics Genera implements the proposed behavior.
Cost to Implementors:
Fairly minor changes to the readtable and the printer.
Cost to Users:
Users who now use the non-portable syntax #S(...) in order
to enter literal pathnames might have to change. [However,
implementations would be free to continue to support this
read syntax for compatibility.]
Cost of Non-Adoption:
Portability of code and data involving pathnames within a
given file system (or between suitably similar file systems)
would be hampered needlessly.
Benefits:
The cost of non-adoption would be avoided.
Aesthetics:
The #P syntax is pretty and hides unimportant details.
Discussion:
Pitman supports this change.
-----
Summary of discussion on CL-Cleanup:
EB noted that Lucid CL implements the proposed behavior and that there
is cost to users who define their own #P read macro. He weakly supports
the proposal but wishes someone had pursued a `generic pathnames' proposal.
Pierson noted that KCL uses #"...", but that this collides with proposed
syntax for Dick Waters' pretty printer. He also thinks #P is better
because it is already more widely used for that purpose.
Masinter noted that Envos Medley prints pathnames with the syntax
#.(pathname "asdf"), which he thinks is not as pretty as #P"asdf"
but currently more portable.
KMP and JonL raised the issue that #. has the disadvantage that it must
be parsed by the full Lisp engine, while #P can be parsed by something
simpler. Permitting #. leaves a gaping hole for trojan horses, and
also requires the presence of the evaluator in a delivery system.
MLY, GSB, Pierson, and IIM argued for not using up an extra dispatch
character.
MLY suggested #S(PATHNAME namestring [optional-host]).
IIM noted they use #.(PATHNAME namestring host) because different file
systems have different parsing conventions.
∂16-Jun-89 2225 X3J13-mailer Issue: PATHNAME-SUBDIRECTORY-LIST (version 7)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 16 Jun 89 22:25:02 PDT
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 612487; 17 Jun 89 01:26:52 EDT
Date: Sat, 17 Jun 89 01:27 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: PATHNAME-SUBDIRECTORY-LIST (version 7)
To: X3J13@sail.stanford.edu
Message-ID: <19890617052724.9.MOON@EUPHRATES.SCRC.Symbolics.COM>
Issue: PATHNAME-SUBDIRECTORY-LIST
References: Pathnames (pp410-413), MAKE-PATHNAME (p416),
PATHNAME-DIRECTORY (p417)
Related-issues: PATHNAME-COMPONENT-CASE, PATHNAME-COMPONENT-VALUE
Category: CHANGE
Edit history: 18-Jun-87, Version 1 by Ghenis.pasa@Xerox.COM
05-Jul-88, Version 2 by Pitman (major revision)
28-Dec-88, Version 3 by Pitman (merge discussion)
22-Mar-89, Version 4 by Moon (fix based on discussion)
19-May-89, Version 5 by Moon (improve based on mail)
21-May-89, Version 6 by Moon (final cleanups)
17-Jun-89, Version 7 by Moon (add current practice
and discussion; minor fixes based on discussion)
Problem Description:
It is impossible to write portable code that can produce a pathname
in a subdirectory of a hierarchical file system. This defeats much of
the purpose of the pathname abstraction.
According to CLtL, only a string is a portable value for the directory
component of a pathname. Thus in order to denote a subdirectory, the use
of punctuation characters (such as dots, slashes, or backslashes) would
be necessary. The very fact that such syntax varies from host to host
means that although the representation might be "portable", the code
using that representation is not portable.
This problem is even worse for programs running on machines on a network
that can retrieve files from multiple hosts, each using a different OS
and thus different subdirectory punctuation.
Related problems:
- In some implementations "FOO.BAR" might denote the "BAR" subdirectory
of "FOO", while in other implementations it would denote a top-level
directory, because "." is not treated as punctuation. To be safe,
portable programs must avoid all potential punctuation characters.
- Even in implementations where "." is used for subdirectories,
"FOO.BAR" may be recognized by some to mean the "BAR" subdirectory of
"FOO" and by others to mean `a seven character directory name with "."
as the fourth character.'
- In fact, CLtL does not even say whether punctuation characters are
part of the string. E.g., is "foo" or "/foo" the directory component for
a Unix pathname "/foo/bar.lisp"? Similarly, is "[FOO]" or "FOO" the
directory component for a VMS pathname "[FOO]ME.LSP"?
PATHNAME-COMPONENT-VALUE:SPECIFY says punctuation characters are not
part of the string.
Proposal (PATHNAME-SUBDIRECTORY-LIST:NEW-REPRESENTATION):
Remove the "structured" directory feature mentioned on CLtL p.412.
Allow the value of a pathname's directory component to be a list. The
car of the list is one of the symbols :ABSOLUTE or :RELATIVE.
Each remaining element of the list is a string or a symbol (see below).
Each string names a single level of directory structure. The strings
should contain only the directory names themselves -- no punctuation
characters.
A list whose car is the symbol :ABSOLUTE represents a directory path
starting from the root directory. The list (:ABSOLUTE) represents
the root directory. The list (:ABSOLUTE "foo" "bar" "baz") represents
the directory called "/foo/bar/baz" in Unix [except possibly for
alphabetic case -- that is the subject of a separate issue].
A list whose car is the symbol :RELATIVE represents a directory path
starting from a default directory. The list (:RELATIVE) has the same
meaning as NIL and hence is not used. The list (:RELATIVE "foo" "bar")
represents the directory named "bar" in the directory named "foo" in the
default directory.
In place of a string, at any point in the list, symbols may occur to
indicate special file notations. The following symbols have standard
meanings. Implementations are permitted to add additional objects of any
non-string type if necessary to represent features of their file systems
that cannot be represented with the standard strings and symbols.
Supplying any non-string, including any of the symbols listed below, to a
file system for which it does not make sense signals an error of type
FILE-ERROR. For example, Unix does not support :WILD-INFERIORS in
most implementations.
:WILD - Wildcard match of one level of directory structure.
:WILD-INFERIORS - Wildcard match of any number of directory levels.
:UP - Go upward in directory structure (semantic).
:BACK - Go upward in directory structure (syntactic).
:ABSOLUTE or :WILD-INFERIORS immediately followed by :UP or :BACK
signals an error.
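The structural rules stated so far could be checked by something like the following sketch. The function name CHECK-DIRECTORY-LIST is hypothetical, and a plain ERROR is signalled because the proposal does not name a condition type for this case.

```lisp
(defun check-directory-list (directory)
  "Return DIRECTORY if it obeys the structural rules, else signal an error."
  ;; The car of the list must be :ABSOLUTE or :RELATIVE.
  (unless (and (consp directory)
               (member (first directory) '(:absolute :relative)))
    (error "Directory ~S does not begin with :ABSOLUTE or :RELATIVE."
           directory))
  ;; :ABSOLUTE or :WILD-INFERIORS may not be immediately followed
  ;; by :UP or :BACK.
  (loop for tail on directory
        for previous = (first tail)
        while (rest tail)
        do (let ((current (second tail)))
             (when (and (member previous '(:absolute :wild-inferiors))
                        (member current '(:up :back)))
               (error "~S may not be immediately followed by ~S in ~S."
                      previous current directory))))
  directory)
```

Whether a particular keyword makes sense for a particular file system is a separate question that this sketch does not try to answer.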
"Syntactic" means that the action of :BACK depends only on the pathname
and not on the contents of the file system. "Semantic" means that the
action of :UP depends on the contents of the file system; to resolve
a pathname containing :UP to a pathname whose directory component
contains only :ABSOLUTE and strings requires probing the file system.
:UP differs from :BACK only in file systems that support multiple
names for directories, perhaps via symbolic links. For example,
suppose that there is a directory
(:ABSOLUTE "X" "Y" "Z")
linked to
(:ABSOLUTE "A" "B" "C")
and there also exist directories
(:ABSOLUTE "A" "B" "Q")
(:ABSOLUTE "X" "Y" "Q")
then
(:ABSOLUTE "X" "Y" "Z" :UP "Q")
designates
(:ABSOLUTE "A" "B" "Q")
while
(:ABSOLUTE "X" "Y" "Z" :BACK "Q")
designates
(:ABSOLUTE "X" "Y" "Q")
If a string is used as the value of the :DIRECTORY argument to
MAKE-PATHNAME, it should be the name of a toplevel directory and should
not contain any punctuation characters. Specifying a string, str, is
equivalent to specifying the list (:ABSOLUTE str). Specifying the symbol
:WILD is equivalent to specifying the list (:ABSOLUTE :WILD-INFERIORS),
or (:ABSOLUTE :WILD) in a file system that does not support :WILD-INFERIORS.
The PATHNAME-DIRECTORY function always returns NIL, :UNSPECIFIC, or a
list, never a string, never :WILD.
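The equivalences just described amount to a small normalization step, which might look like the sketch below. The function name CANONICALIZE-DIRECTORY and the keyword argument WILD-INFERIORS-SUPPORTED are hypothetical names chosen for illustration.

```lisp
(defun canonicalize-directory (directory &key (wild-inferiors-supported t))
  "Convert a string or :WILD directory argument to the equivalent list."
  (cond ((stringp directory)
         ;; A string names a toplevel directory.
         (list :absolute directory))
        ((eq directory :wild)
         ;; :WILD matches everything, using :WILD-INFERIORS where available.
         (list :absolute
               (if wild-inferiors-supported :wild-inferiors :wild)))
        (t
         ;; NIL, :UNSPECIFIC, and lists are passed through unchanged.
         directory)))
```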
In non-hierarchical file systems, the only valid list values for the
directory component of a pathname are (:ABSOLUTE string) and
(:ABSOLUTE :WILD). :RELATIVE directories and the keywords
:WILD-INFERIORS, :UP, and :BACK are not used in non-hierarchical file
systems.
Pathname merging treats a relative directory specially. Let
<pathname> and <defaults> be the first two arguments to
MERGE-PATHNAMES. If (PATHNAME-DIRECTORY <pathname>) is a list whose
car is :RELATIVE, and (PATHNAME-DIRECTORY <defaults>) is a list, then
the merged directory is the value of
(APPEND (PATHNAME-DIRECTORY <defaults>)
(CDR ;remove :RELATIVE from the front
(PATHNAME-DIRECTORY <pathname>)))
except that if the resulting list contains a string or :WILD immediately
followed by :BACK, both of them are removed. This removal of redundant
:BACKs is repeated as many times as possible.
If (PATHNAME-DIRECTORY <defaults>) is not a list or
(PATHNAME-DIRECTORY <pathname>) is not a list whose car is :RELATIVE, the
merged directory is
(OR (PATHNAME-DIRECTORY <pathname>) (PATHNAME-DIRECTORY <defaults>))
A relative directory in the pathname argument to a function such as
OPEN is merged with *DEFAULT-PATHNAME-DEFAULTS* before accessing the
file system.
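The merging rule for directory components can be sketched on plain lists, without touching any real pathname objects. The function names below are hypothetical. The sketch treats NIL as "not a list" for the purposes of the rule, which is an assumption about the intended reading.

```lisp
(defun remove-redundant-backs (directory)
  "Remove each string or :WILD that is immediately followed by :BACK."
  (let ((result '()))                   ; elements kept so far, in reverse
    (dolist (element directory (nreverse result))
      (if (and (eq element :back)
               result
               (or (stringp (first result))
                   (eq (first result) :wild)))
          (pop result)                  ; drop the pair; may expose another
          (push element result)))))

(defun merge-directories (dir default-dir)
  "Merge directory component DIR with DEFAULT-DIR as described above."
  (if (and (consp dir)
           (eq (first dir) :relative)
           (consp default-dir))
      (remove-redundant-backs
       (append default-dir (rest dir))) ; REST removes :RELATIVE
      (or dir default-dir)))
```

For example, merging (:RELATIVE :BACK "lib") with (:ABSOLUTE "usr" "prog") gives (:ABSOLUTE "usr" "lib"), whereas the same merge with :UP in place of :BACK leaves the :UP in the result, since resolving it would require probing the file system.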
Test Cases/Examples:
(PATHNAME-DIRECTORY (PARSE-NAMESTRING "[FOO.BAR]BAZ.LSP")) ;on VMS
=> (:ABSOLUTE "FOO" "BAR")
(PATHNAME-DIRECTORY (PARSE-NAMESTRING "/foo/bar/baz.lisp")) ;on Unix
=> (:ABSOLUTE "foo" "bar")
or (:ABSOLUTE "FOO" "BAR")
If PATHNAME-COMPONENT-CASE:KEYWORD-ARGUMENT passes with a default of
:COMMON, the value is the second one shown.
(PATHNAME-DIRECTORY (PARSE-NAMESTRING "../baz.lisp")) ;on Unix
=> (:RELATIVE :UP)
(PATHNAME-DIRECTORY (PARSE-NAMESTRING "/foo/bar/../mum/baz")) ;on Unix
=> (:ABSOLUTE "foo" "bar" :UP "mum")
(PATHNAME-DIRECTORY (PARSE-NAMESTRING ">foo>**>bar>baz.lisp")) ;on LispM
=> (:ABSOLUTE "FOO" :WILD-INFERIORS "BAR")
(PATHNAME-DIRECTORY (PARSE-NAMESTRING ">foo>*>bar>baz.lisp")) ;on LispM
=> (:ABSOLUTE "FOO" :WILD "BAR")
Rationale:
This would allow programs to deal usefully with hierarchical file
systems, which are by far the most common file system type.
This would allow a system construction utility that organizes programs
by subdirectories to be portable to all implementations that have
hierarchical file systems.
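As an illustration of the kind of code a system construction utility could then contain, the following sketch names a file in a subdirectory without using any host-specific punctuation. The function name FILE-IN-SUBDIRECTORY and the directory names in the usage example are hypothetical.

```lisp
(defun file-in-subdirectory (root subdirectory name type)
  "Return a pathname for NAME.TYPE in SUBDIRECTORY beneath the directory of ROOT."
  (merge-pathnames
   ;; A relative directory list is appended to ROOT's directory by merging.
   (make-pathname :directory (list :relative subdirectory)
                  :name name
                  :type type)
   root))

;; Example use, assuming a hypothetical installation root:
;; (file-in-subdirectory (make-pathname :directory '(:absolute "usr" "prog"))
;;                       "src" "main" "lisp")
```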
Discussion indicated that "Implementations are permitted to add
additional objects of any non-string type if necessary to represent
features of their file systems that cannot be represented with the
standard strings and symbols" is a necessary escape hatch for things like
home directories and fancy pattern matching. Implementations should
limit their use of this loophole and use the standard keyword symbols
whenever that is possible.
Current Practice:
Symbolics Genera implements something very similar to this. The main
differences are:
- In Genera, there is no :ABSOLUTE keyword at the head of the list.
This has been shown to cause some problems in dealing with root
directories. Genera represents the root directory by a keyword
symbol (rather than a list) because the list representation
was not adequately general.
- Genera has no separate concepts of :UP and :BACK. Genera
represents Unix ".." as :UP, but deals with :UP syntactically, not
semantically.
On the Explorer, the directory component is a list of strings, not yet
supporting the symbols specified in proposal PATHNAME-SUBDIRECTORY-LIST.
Macintosh Allegro Common Lisp 1.2.2 uses a string with punctuation
characters instead of a list for the directory.
Lucid Common Lisp 3.0.1 under Unix uses a list for directories of
somewhat different form from what is proposed in
PATHNAME-SUBDIRECTORY-LIST. It uses :ROOT instead of :ABSOLUTE and uses
".." instead of :UP. It does use :RELATIVE.
Ibuki Common Lisp Release 01/01 uses a list for directories of somewhat
different form from what is proposed in PATHNAME-SUBDIRECTORY-LIST. It
uses :ROOT instead of :ABSOLUTE, uses :PARENT instead of :UP, and omits
the leading keyword instead of using :RELATIVE.
IIM uses a list for directories of somewhat different form from what is
proposed in PATHNAME-SUBDIRECTORY-LIST. It uses :ABSOLUTE-DIRECTORY
instead of :ABSOLUTE, uses :SUPER-DIRECTORY instead of :BACK, and omits
the leading keyword instead of using :RELATIVE.
Cost to Implementors:
In principle, nothing about the implementation needs to change except
the treatment of the directory component by MAKE-PATHNAME and
PATHNAME-DIRECTORY. The internal representation can otherwise be left
as-is if necessary.
Implementations such as Genera, Explorer, Lucid, Ibuki, and IIM that
already have hierarchical directory handling will have to make an
incompatible change to switch to what is proposed here.
For implementations that choose to rationalize this representation
throughout their internals and any other implementation-specific
accessors, the cost will be necessarily higher.
Cost to Users:
None for portable programs. This change is upward compatible with CLtL.
Nonportable programs will have to be changed if they use implementation
dependent hierarchical directory handling and the implementation
removes support for that when it adds support for this proposal.
Cost of Non-Adoption:
Serious portability problems would continue to occur. Programmers would be
driven to the use of implementation-specific facilities because the need
for this is frequently impossible to ignore.
Benefits:
The serious costs of non-adoption would be avoided.
Aesthetics:
This representation of hierarchical pathnames is easy to use and quite
general. Users will probably see this as an improvement in the aesthetics.
Discussion:
This issue was raised a while back but no one was fond of the particular
proposal that was submitted. This is an attempt to revive the issue.
The original proposal, to add a :SUBDIRECTORIES component to a
pathname, was discarded because it imposed an unnatural distinction
between a toplevel directory and its subdirectories. Pitman's guess is
that the idea was to try to make it a compatible change, but since most
programmers will probably want to change from implementation-specific
primitives to portable ones anyway, that's probably not such a big
deal. Also, there could have been some programs which thought the
change was compatible and ended up ignoring important information (the
:SUBDIRECTORIES component). Pitman thought it would be better if
people just accepted the cost of an incompatible change in order to
get something really pretty as a result.
Some people feel it is unnecessary to standardize the format of
pathname components such as the directory.
Moon doesn't like having both :UP and :BACK, but admits that some
file systems do it one way and some do it the other. He still thinks
it would be simpler if we got rid of :BACK and just had :UP.
To keep it simple, we chose not to add to this issue the functions
DIRECTORY-PATHNAME-AS-FILE and PATHNAME-AS-DIRECTORY, which convert
the name of a directory from or to the directory component of a file
inferior to that directory. This conversion is system-dependent, for
example TOPS-20 appends a type field and Unix does not. Also in some
systems the root directory has a name and in others it doesn't. Of
course these functions signal an error in non-hierarchical file
systems. Examples (for Unix, assuming #P print syntax for pathnames):
(directory-pathname-as-file #P"/usr/bin/sh") => #P"/usr/bin"
(pathname-as-directory #P"/usr/bin") => #P"/usr/bin/"
These functions have not been proposed because they are mainly useful
in conjunction with additional functions for manipulating directories
(creating, expunging, setting access control) that have not been made
available in Common Lisp.
∂23-Mar-89 2059 X3J13-mailer **DRAFT** Issue: PATHNAME-SYNTAX-ERROR-TIME (Version 1)
Received: from STONY-BROOK.SCRC.Symbolics.COM (SCRC-STONY-BROOK.ARPA) by SAIL.Stanford.EDU with TCP; 23 Mar 89 20:59:16 PST
Received: from BOBOLINK.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 563835; Thu 23-Mar-89 15:19:53 EST
Date: Thu, 23 Mar 89 15:19 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: **DRAFT** Issue: PATHNAME-SYNTAX-ERROR-TIME (Version 1)
To: X3J13@SAIL.Stanford.EDU
Message-ID: <890323151934.2.KMP@BOBOLINK.SCRC.Symbolics.COM>
>>> PLEASE DO -NOT- REPLY TO THIS ISSUE <<<
Ponder it and bring your comments to the meeting.
See summary of CL-Cleanup discussion at end.
-kmp
-----
Issue: PATHNAME-SYNTAX-ERROR-TIME
References: File System Interface (pp409-427)
Category: CLARIFICATION
Edit history: 07-Jul-88, Version 1 by Pitman
Status: For Internal Discussion
Problem Description:
There exist conceivable pathnames for which there is no valid mapping in a
particular implementation. CLtL is not clear about the time at which this
error might be detected.
For example, on file systems which constrain pathname-types to be three
letters or fewer, the type "LISP" is not valid. The question arises: when
is this error detected?
In some implementations, the error might be detected while forming the
pathname. That is, (MAKE-PATHNAME :TYPE "LISP") signals an error.
In some implementations, the error might be detected while forming the
namestring. That is, (MAKE-PATHNAME :TYPE "LISP") succeeds, but
(NAMESTRING (MAKE-PATHNAME :TYPE "LISP")) signals an error.
In some implementations, validity checking might be done only by the host
operating system, so Lisp does not detect the error unless the operating
system complains. For example, (MAKE-PATHNAME :TYPE "LISP") succeeds,
and even (NAMESTRING (MAKE-PATHNAME :TYPE "LISP")) constructs a plausible
looking pathname, but (OPEN (NAMESTRING (MAKE-PATHNAME :TYPE "LISP"))) fails.
In some implementations, Lisp might make `friendly' corrections to the
pathname in order to form a namestring. For example,
(MAKE-PATHNAME :TYPE "LISP") might succeed, but
(NAMESTRING (MAKE-PATHNAME :TYPE "LISP")) might produce a namestring with
an extension such as ".LIS" or ".LSP".
Similar issues might come up in file systems which don't allow wildcard
pathnames. Is :WILD allowed in a name or type slot and then disallowed
upon coercion to a pathname, or is :WILD complained about "up front"?
This phenomenon is a barrier to portability because if a program is
debugged in an implementation that does, for example, NAMESTRING-time
error checking, the programmer may be lulled into an expectation that
it is acceptable to construct and manipulate invalid pathnames as long
as the problem is caught before NAMESTRING is called. On the other
hand, another programmer may debug his code in
a Lisp which does early error checking of syntax and may assume that
he is home free if the pathname gets constructed correctly.
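A program can probe which of these behaviors its implementation exhibits with something like the following sketch. The function name SYNTAX-ERROR-STAGE and the returned keywords are hypothetical, and the sketch deliberately stops short of calling OPEN, so it cannot distinguish the last two cases described above.

```lisp
(defun syntax-error-stage (type)
  "Report where this implementation rejects TYPE as a pathname type, if anywhere."
  (handler-case
      (let ((candidate (make-pathname :name "FOO" :type type)))
        (handler-case
            (progn
              (namestring candidate)
              ;; Lisp accepted it, or silently corrected it.
              :not-detected-by-lisp)
          (error () :namestring)))      ; rejected when forming the namestring
    (error () :make-pathname)))         ; rejected when forming the pathname

;; Example use: (syntax-error-stage "LISP")
```

That portable code should need such a probe at all is the portability problem this issue describes.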
Proposal (PATHNAME-SYNTAX-ERROR-TIME:PATHNAME-CREATION):
Clarify that operations such as MAKE-PATHNAME and MERGE-PATHNAMES which
construct new pathnames do plausibility checking of their arguments
and signal an error rather than construct a pathname for which NAMESTRING
would not produce a valid pathname.
Rationale:
This would identify clearly to the programmer where he should expect an
error to be signalled for a pathname.
This would mean that fully constructed pathnames could reliably
be converted to namestrings.
Cost to Implementors:
Some implementors, especially those which rely on the operating system
to be the sole authority on pathname syntax, might have to introduce
some new syntax-checking facilities.
Implementations where this error checking is done later would have to be
changed both to do it earlier, and to not make the unwarranted assumption
that pathnames with no valid namestring representation are constructable.
Cost to Users:
The ability to represent non-viable pathnames for the purpose of merging
would be lost. This feature was not portably available, but was available
in some operating systems.
Some code which expected an error, but expected it at a different time
would have to be changed.
Proposal (PATHNAME-SYNTAX-ERROR-TIME:NAMESTRING-COERCION):
Clarify that it is valid to create a pathname which cannot be
converted to a namestring. Require NAMESTRING (and related functions,
such as ENOUGH-NAMESTRING or any internal functions that might be used
in place of NAMESTRING by functions like OPEN and PROBE-FILE) to signal
an error for pathnames which do not represent valid filenames in the
designated file system.
Rationale:
This would identify clearly to the programmer where he should expect an
error to be signalled for a pathname.
This would allow the construction of pathnames for the sole purpose of
merging without causing what might seem to some as gratuitous errors.
Cost to Implementors:
Implementors who rely on the operating system to be the sole authority
on pathname syntax, might have to introduce some new syntax-checking
facilities.
Implementations where this error checking is done earlier would have to
be changed both to do it later, and to not make the unwarranted
assumption that any pathname has a valid namestring representation.
Cost to Users:
Early error checking of faulty pathnames would be lost.
Some code which expected an error, but expected it at a different time
would have to be changed.
Benefits:
Macsyma, for example, has encountered a need for "hostless" pathnames
(in merging). The concept makes no sense if every pathname must have
a namestring, because a pathname with no host cannot have a namestring.
However, if it's NAMESTRING's responsibility to signal an error, then
hostless pathnames are still useful for merging. Consider:
(MERGE-PATHNAMES (MAKE-PATHNAME :NAME "FRED") MARY)
This will override both the NAME and the HOST field of MARY because you
must currently have a host in every pathname. But if MAKE-PATHNAME did
not force the host, or if one could explicitly say :HOST NIL, then
such pathnames would be considerably more useful for merging.
Proposal (PATHNAME-SYNTAX-ERROR-TIME:EXPLICITLY-VAGUE):
Clarify that we were unable to reach agreement on this issue and that
the time at which this error detection occurs is not well-specified.
Advise the editorial group to warn users clearly about this known source
of program portability problems.
Rationale:
This implements the status quo.
Cost to Implementors:
None.
Cost to Users:
No existing code must be modified, but there is an ongoing cost
associated with providing error checking at multiple points in a
program because implementations disagree as to where an error
might be signalled. In some cases, the effects of having to handle
this in multiple places may cause unpleasant modularity violations.
Test Case:
See problem description.
Current Practice:
Symbolics Genera signals an error at pathname construction time if a
pathname will be invalid. Once a pathname is successfully constructed,
it can generally be assumed that NAMESTRING will always succeed.
Aesthetics:
Making this more well-defined would cause a definite aesthetic
improvement to some programs.
Discussion:
Pitman prefers PATHNAME-SYNTAX-ERROR-TIME:NAMESTRING-COERCION but
believes that anything is an improvement over ...:EXPLICITLY-VAGUE.
CL pathname functions were not adequate for use in Macsyma because
they did not adequately represent to-be-merged-only pathnames (a
feature used very extensively in Macsyma), because errors could be
signalled at radically different times. To get around this, Pitman
had to create a data structure in Macsyma called an MPATHNAME which
was only trivially different than a PATHNAME but which made it
possible to deal portably with this issue of when errors occurred
and what kinds of errors occurred. Unfortunately, since none of the
CL functions worked on MPATHNAMEs, a whole series of functions,
also only trivially different, had to be created: MAKE-MPATHNAME,
MNAMESTRING, MERGE-MPATHNAMES, MPATHNAME-NAME, MPATHNAME-TYPE,
MOPEN, WITH-MOPEN-FILE, etc.
------
Summary of CL-Cleanup discussion:
Most of the mail was endorsements for option PATHNAME-CREATION.
Sandra brought up a tangential issue about truenames that eventually
became a separate issue.
I think I'm the only person pushing NAMESTRING-COERCION. I strongly
believe it is the right thing, and that PATHNAME-CREATION is suboptimal,
based on problems that have struck me with the existing CL pathname system.
However, even PATHNAME-CREATION would be an improvement from a
portability standpoint and I am probably not going to push it because
there are compatibility issues on the side of PATHNAME-CREATION (many
implementations do this already), and because there are more important
issues for us to spend time on at the meeting.
[Please try to come prepared to vote yes on one or both of
PATHNAME-CREATION or NAMESTRING-COERCION so we don't have to fall back
on EXPLICITLY-VAGUE, which is a total loss for program portability.
-kmp]
∂16-Jun-89 2126 X3J13-mailer Issue: PATHNAME-SYSTEM-TYPE (version 2)
Date: Sat, 17 Jun 89 00:28 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: PATHNAME-SYSTEM-TYPE (version 2)
To: X3J13@sail.stanford.edu
Message-ID: <19890617042828.6.MOON@EUPHRATES.SCRC.Symbolics.COM>
Issue: PATHNAME-SYSTEM-TYPE
References: Internet RFC 1010, pages 24-25
Related issues: PATHNAME-LOGICAL
Category: ADDITION
Edit history: 23-May-89, Version 1 by Moon
17-Jun-89, Version 2 by Moon (add Macintosh, add discussion)
Problem description:
It is sometimes necessary to write nonportable pathname manipulation code
that performs operations specific to individual file systems. Sometimes
this is to get around inadequacies of the Common Lisp pathname model,
sometimes it is to take advantage of idiosyncratic features of a
particular file system. Common Lisp does not provide any way to ask what
file system a pathname is for, so there is no good way for this type of
pathname manipulation code to be sure what file system it is dealing
with. Sometimes it can tell by checking what Lisp implementation it is
running in, but as more and more implementations support network file
access, this is becoming less reliable.
Proposal (PATHNAME-SYSTEM-TYPE:ADD-FUNCTION):
Add the following function:
PATHNAME-SYSTEM-TYPE pathname [Function]
Coerce the pathname argument to a pathname, signalling an error of type
TYPE-ERROR if the argument is not a pathname, string, or file stream.
Return a keyword symbol that identifies the type of file system this
pathname is for. The names of these symbols are derived from the
system type names used by the Internet Domain Name system, listed in
the referenced document. Implementations that use a file system listed
in that document, or superseding documents, should return a symbol in
the keyword package whose name comes from that document. Examples:
:MSDOS MS/DOS or PC/DOS
:TOPS10 TOPS-10
:TOPS20 TOPS-20
:ULTRIX Ultrix
:UNIX Unix with long file names (4.2 or newer)
:VM/370 VM/370
:VMS VAX/VMS with long file names (version 4.4 or newer)
:XENIX Xenix
The following additional symbols are specified by Common Lisp:
:LOGICAL logical pathname (see issue PATHNAME-LOGICAL)
:MACINTOSH MacOS (missing from RFC 1010 for some reason)
:UNIX-14 Unix with 14-character file name limit
:VMS-9 VAX/VMS with 9-character file name limit
NIL system type cannot be determined
For file systems not named in the referenced document, implementations
should make up a name consistent with the spelling conventions defined
in that document.
Examples:
;; On a non-networked IBM PC:
(PATHNAME-SYSTEM-TYPE (USER-HOMEDIR-PATHNAME)) => :MSDOS
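A minimal sketch of how nonportable code might dispatch on the keyword
that PATHNAME-SYSTEM-TYPE is proposed to return. PATHNAME-SYSTEM-TYPE is
only a proposal here, so the sketch takes the keyword as an argument
instead of calling it. The function names FILE-NAME-LENGTH-LIMIT and
NAME-FITS-P are hypothetical; the two limits come from the :UNIX-14 and
:VMS-9 entries listed above.

```lisp
(defun file-name-length-limit (system-type)
  "Return the file name length limit implied by the keyword SYSTEM-TYPE,
or NIL when the keyword itself implies no particular limit."
  (case system-type
    (:unix-14 14)      ; Unix with 14-character file name limit
    (:vms-9 9)         ; VAX/VMS with 9-character file name limit
    (otherwise nil)))  ; includes NIL, "system type cannot be determined"

(defun name-fits-p (name system-type)
  "True if the string NAME is within the limit implied by SYSTEM-TYPE."
  (let ((limit (file-name-length-limit system-type)))
    (or (null limit)
        (<= (length name) limit))))
```

In an implementation that adopted the proposal, a caller would presumably
write (NAME-FITS-P name (PATHNAME-SYSTEM-TYPE pathname)).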
Rationale:
PATHNAME-SYSTEM-TYPE gives a nonportable pathname manipulation program
the basic information it needs to interpret namestrings and manipulate
pathname components.
Current practice:
Symbolics Genera has had a similar feature under a different name
for many years. A few of the keyword symbols returned by Genera
are spelled differently, merely for historical reasons.
Cost to Implementors:
Trivial.
Cost to Users:
None.
Cost of non-adoption:
Implementation-dependent kludges will have to be used.
Performance impact:
None.
Benefits:
Improved esthetics.
Esthetics:
Implementation-dependent kludges will not have to be used.
Discussion:
Some people feel that this feature is unnecessary.
∂19-Jun-89 1545 X3J13-mailer Issue: PATHNAME-WILD (version 6)
Date: Mon, 19 Jun 89 17:42 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: PATHNAME-WILD (version 6)
To: X3J13@sail.stanford.edu
Message-ID: <19890619214210.6.MOON@EUPHRATES.SCRC.Symbolics.COM>
This issue is on the agenda for the June X3J13 meeting. KMP and I
have prepared a revised writeup which we think is ready for release.
Issue: PATHNAME-WILD
Forum: Cleanup
References: Pathnames (pp410-413)
Related issues: PATHNAME-COMPONENT-VALUE, PATHNAME-LOGICAL
Category: ADDITION
Edit history: 21-Jul-88, Version 1 by Pitman
06-Oct-88, Version 2 by Pitman
9-May-89, Version 3 by Moon (small fixes)
10-May-89, Version 4 by Moon (add two more functions)
13-May-89, Version 5 by Moon (minor cleanups, add clarification)
19-Jun-89, Version 6 by Moon (revise based on extensive
discussion in the cleanup subcommittee; rewrite
the description of TRANSLATE-PATHNAME so it is
possible to understand it)
Problem Description:
Some file systems provide more complex conventions for wildcards than
simple component-wise wildcards (:WILD). For example,
"F*O" might mean:
- a normal three character name
- a three-character name, with the middle char wild
- at least a two-character name, with the middle 0 or more chars wild
- a wild match spanning multiple directories
">foo>*>bar" might imply:
- the middle directory is named "*"
- the middle directory is :WILD
- there may be zero or more :WILD middle directories
- the middle directory name matches any one-letter name
">foo>**>bar" might mean
- the middle directory is named "**"
- the middle directory is :WILD
- there may be zero or more :WILD middle directories
- the middle directory name matches any two-letter name
Some file systems support even more complex wildcards, for example
regular expressions.
The CL pathname model does not specify a way to represent complex
wildcards, which means, for example, that (MAKE-PATHNAME :NAME "F*O")
cannot be recognized by portable code as containing a wildcard.
Common Lisp provides only the first of these four common operations
on wildcard pathnames:
(1) Enumerate the set of existing files that match the pathname;
this is provided by the DIRECTORY function.
(2) Test whether a pathname contains wildcards.
(3) Test whether a pathname matches a wildcard pathname.
(4) Translate one pathname into another according to a mapping specified
by a pair of wildcard pathnames.
Proposal (PATHNAME-WILD:NEW-FUNCTIONS):
Introduce the following three functions:
1. WILD-PATHNAME-P pathname &optional field-key
Tests a pathname for the presence of wildcard components. If the first
argument is not a pathname, string, or file stream an error of type
TYPE-ERROR is signalled.
If no <field-key> is provided, or the <field-key> is NIL, the result is
T if <pathname> has any wildcard components, NIL if <pathname> has none.
If a non-null <field-key> is provided, it must be one of :HOST, :DEVICE,
:DIRECTORY, :NAME, :TYPE, or :VERSION. In this case, the result is T
if the indicated component of <pathname> is a wildcard, NIL if the
component is not a wildcard. Note that not all implementations
support wildcards in all fields, according to PATHNAME-COMPONENT-VALUE.
2. PATHNAME-MATCH-P pathname wildcard
T if <pathname> matches <wildcard>, otherwise NIL. The matching rules
are implementation-defined but should be consistent with the
DIRECTORY function. Missing components of <wildcard> default to :WILD.
If either argument is not a pathname, string, or file stream an error
of type TYPE-ERROR is signalled. It is valid for <pathname> to be a
wild pathname; a wildcard field in <pathname> will only match a
wildcard field in <wildcard>, i.e. the function is not commutative.
It is valid for <wildcard> to be a non-wild pathname.
3. TRANSLATE-PATHNAME source from-wildcard to-wildcard &optional reversible
Translates the pathname <source>, which matches <from-wildcard>, into
a corresponding pathname <result>, which matches <to-wildcard>, and
returns <result>.
The pathname <result> is <to-wildcard> with each wildcard or missing
field replaced by a portion of <source>. A "wildcard field" is a
pathname component with a value of :WILD, a :WILD element of a
list-valued directory component, or an implementation-defined portion
of a component, such as the "*" in the complex wildcard string
"foo*bar" that some implementations support. An implementation that
adds other wildcard features, such as regular expressions, must define
how TRANSLATE-PATHNAME extends to those features. A "missing field" is
a pathname component with a value of NIL.
The portion of <source> that is copied into <result> is
implementation-defined. Typically it is determined by the user interface
conventions of the file systems involved. Usually it is the portion of <source>
that matches a wildcard field of <from-wildcard> that is in the same
position as the wildcard or missing field of <to-wildcard>. If there
is no wildcard field in <from-wildcard> at that position, then usually
it is the entire corresponding pathname component of <source>, or in
the case of a list-valued directory component, the entire corresponding
list element. For example, if the name components of <source>,
<from-wildcard>, and <to-wildcard> are "gazonk", "gaz*", and "h*"
respectively, then in most file systems, the wildcard fields of the
name component of <from-wildcard> and <to-wildcard> are each "*", the
matching portion of <source> is "onk", and the name component of
<result> is "honk". However, the exact behavior of TRANSLATE-PATHNAME
cannot be dictated by the Common Lisp language and must be allowed to
vary, depending on the user interface conventions of the file systems
involved.
During the copying of a portion of <source> into <result>, additional
implementation-defined translations of alphabetic case or file naming
conventions might occur, especially when <from-wildcard> and
<to-wildcard> are for different hosts.
If <reversible> is true, the translation must be reversible, that is,
the following identity must hold for all cases where no error is
signalled:
(equal (translate-pathname (translate-pathname pathname from to t)
to from t)
pathname)
In some file systems the above identity is true only when
(member (pathname-version pathname) '(:newest :unspecific)).
This is considered valid, as Common Lisp cannot force all the
file systems in the world to implement versions.
If <reversible> is false (which is the default), the translation is
determined by the user interface conventions of the file systems
involved and is not necessarily reversible. In some file systems the
<reversible> argument is ignored because the user interface conventions
are reversible anyway.
If any of the first three arguments is not a pathname, string, or file
stream an error of type TYPE-ERROR is signalled. It is valid for
<source> to be a wild pathname; in general this will produce a wild
result. It is valid for <from-wildcard> and/or <to-wildcard> to be
non-wild pathnames. (PATHNAME-MATCH-P <source> <from-wildcard>) must
be true or an error is signalled.
Implementation guideline: one file system performs this operation by
examining each piece of the three pathnames in turn, where a piece is a
pathname component or a list element of a structured component such as
a hierarchical directory. Hierarchical directory elements in
<from-wildcard> and <to-wildcard> are matched by whether they are
wildcards, not by depth in the directory hierarchy. If the piece in
<to-wildcard> is present and not wild, it is copied into the result.
If the piece in <to-wildcard> is :WILD or NIL, and either <reversible>
is false or the piece in <from-wildcard> is not a complex wildcard, the
piece in <source> is copied into the result. Otherwise, the piece in
<to-wildcard> might be a complex wildcard such as "foo*bar" and the
piece in <from-wildcard> should be wild; the portion of the piece in
<source> that matches the wildcard portion of the piece in
<from-wildcard> replaces the wildcard portion of the piece in
<to-wildcard> and the value produced is used in the result.
4. Clarify that the functions OPEN (and WITH-OPEN-FILE), PROBE-FILE,
FILE-WRITE-DATE, FILE-AUTHOR, and TRUENAME only accept non-wildcard
pathnames and signal an error if given a pathname for which
WILD-PATHNAME-P returns true.
5. Clarify that the functions RENAME-FILE, DELETE-FILE, LOAD, and
COMPILE-FILE have implementation-defined consequences when given a
wildcard pathname. Each function might signal an error or might operate
on all files that match the wildcard pathname.
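As a sketch of how a program might use WILD-PATHNAME-P with the field keys
of point 1 to enforce the restriction in point 4 before calling a function
such as OPEN, assuming an implementation that provides WILD-PATHNAME-P as
proposed. The names WILD-FIELDS and REQUIRE-NON-WILD are hypothetical.

```lisp
(defun wild-fields (pathname)
  "Return the field keys of PATHNAME that WILD-PATHNAME-P reports as wild."
  (remove-if-not (lambda (field-key)
                   (wild-pathname-p pathname field-key))
                 '(:host :device :directory :name :type :version)))

(defun require-non-wild (pathname)
  "Return PATHNAME coerced to a pathname, or signal an error if it is wild."
  (let ((pathname (pathname pathname)))
    (when (wild-pathname-p pathname)
      (error "~S is a wildcard pathname (wild fields: ~S)"
             pathname (wild-fields pathname)))
    pathname))
```

Which namestrings parse into wild components is implementation-defined,
so the sketch is best exercised with components given explicitly, as in
(WILD-FIELDS (MAKE-PATHNAME :NAME :WILD)).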
Examples:
;The following examples are not portable. They are written to run
;with particular file systems and particular wildcard conventions.
;Other implementations will behave differently. These examples are
;intended to be illustrative, not to be prescriptive.
(WILD-PATHNAME-P (MAKE-PATHNAME :NAME :WILD)) => T
(WILD-PATHNAME-P (MAKE-PATHNAME :NAME :WILD) :NAME) => T
(WILD-PATHNAME-P (MAKE-PATHNAME :NAME :WILD) :TYPE) => NIL
(WILD-PATHNAME-P (PATHNAME "S:>foo>**>")) => T ;Lispm
(WILD-PATHNAME-P (MAKE-PATHNAME :NAME "F*O")) => T ;Most places
;This example assumes one particular set of wildcard conventions
;Not all file systems will run this example exactly as written
(DEFUN RENAME-FILES (FROM TO)
(DOLIST (FILE (DIRECTORY FROM))
(RENAME-FILE FILE (TRANSLATE-PATHNAME FILE FROM TO))))
(RENAME-FILES "/usr/me/*.lisp" "/dev/her/*.l")
;Renames /usr/me/init.lisp to /dev/her/init.l
(RENAME-FILES "/usr/me/pcl*/*" "/sys/pcl/*/")
;Renames /usr/me/pcl-5-may/low.lisp to /sys/pcl/pcl-5-may/low.lisp
;In some file systems the result might be /sys/pcl/5-may/low.lisp
(RENAME-FILES "/usr/me/pcl*/*" "/sys/library/*/")
;Renames /usr/me/pcl-5-may/low.lisp to /sys/library/pcl-5-may/low.lisp
;In some file systems the result might be /sys/library/5-may/low.lisp
(RENAME-FILES "/usr/me/foo.bar" "/usr/me2/")
;Renames /usr/me/foo.bar to /usr/me2/foo.bar
(RENAME-FILES "/usr/joe/*-recipes.text" "/usr/jim/cookbook/joe's-*-rec.text")
;Renames /usr/joe/lamb-recipes.text to /usr/jim/cookbook/joe's-lamb-rec.text
;Renames /usr/joe/pork-recipes.text to /usr/jim/cookbook/joe's-pork-rec.text
;Renames /usr/joe/veg-recipes.text to /usr/jim/cookbook/joe's-veg-rec.text
;This example assumes one particular set of wildcard conventions and
;illustrates how and why reversible translation uses different rules
(PATHNAME-NAME (TRANSLATE-PATHNAME "foobar" "foo*" "*baz" NIL)) => "barbaz"
(PATHNAME-NAME (TRANSLATE-PATHNAME "foobar" "foo*" "*baz" T)) => "barbaz"
(PATHNAME-NAME (TRANSLATE-PATHNAME "foobar" "foo*" "*" NIL)) => "foobar"
(PATHNAME-NAME (TRANSLATE-PATHNAME "foobar" "foo*" "*" T)) => "bar"
(PATHNAME-NAME (TRANSLATE-PATHNAME "foobar" "*" "foo*" NIL)) => "foofoobar"
(PATHNAME-NAME (TRANSLATE-PATHNAME "foobar" "*" "foo*" T)) => "foofoobar"
(PATHNAME-NAME (TRANSLATE-PATHNAME "bar" "*" "foo*" NIL)) => "foobar"
(PATHNAME-NAME (TRANSLATE-PATHNAME "bar" "*" "foo*" T)) => "foobar"
;Using Unix syntax and the wildcard conventions used by the
;particular version of Unix on which I tried this:
(NAMESTRING
(TRANSLATE-PATHNAME "/usr/dmr/hacks/frob.l"
"/usr/d*/hacks/*.l"
"/usr/d*/backup/hacks/backup-*.*"))
=> "/usr/dmr/backup/hacks/backup-frob.l"
(NAMESTRING
(TRANSLATE-PATHNAME "/usr/dmr/hacks/frob.l"
"/usr/d*/hacks/fr*.l"
"/usr/d*/backup/hacks/backup-*.*"))
=> "/usr/dmr/backup/hacks/backup-ob.l"
;This is similar to the above example but uses two different hosts,
;U: which is a Unix and V: which is a VMS. Note the translation
;of file type and alphabetic case conventions.
(NAMESTRING
(TRANSLATE-PATHNAME "U:/usr/dmr/hacks/frob.l"
"U:/usr/d*/hacks/*.l"
"V:SYS$DISK:[D*.BACKUP.HACKS]BACKUP-*.*"))
=> "V:SYS$DISK:[DMR.BACKUP.HACKS]BACKUP-FROB.LSP"
(NAMESTRING
(TRANSLATE-PATHNAME "U:/usr/dmr/hacks/frob.l"
"U:/usr/d*/hacks/fr*.l"
"V:SYS$DISK:[D*.BACKUP.HACKS]BACKUP-*.*"))
=> "V:SYS$DISK:[DMR.BACKUP.HACKS]BACKUP-OB.LSP"
;This example presumes background information described in PATHNAME-LOGICAL
(DEFUN TRANSLATE-LOGICAL-PATHNAME-1 (PATHNAME RULES)
(LET ((RULE (ASSOC PATHNAME RULES :TEST #'PATHNAME-MATCH-P)))
(UNLESS RULE (ERROR "No translation rule for ~A" PATHNAME))
(TRANSLATE-PATHNAME PATHNAME (FIRST RULE) (SECOND RULE) T)))
(TRANSLATE-LOGICAL-PATHNAME-1 "FOO:CODE;BASIC.LISP"
'(("FOO:DOCUMENTATION;" "MY-UNIX:/doc/foo/")
("FOO:CODE;" "MY-UNIX:/lib/foo/")
("FOO:PATCHES;*;" "MY-UNIX:/lib/foo/patch/*/")))
=> the pathname MY-UNIX:/lib/foo/basic.l
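The substitution of the matched portion described for TRANSLATE-PATHNAME
(the "gazonk", "gaz*", "h*" case, and the reversible "foobar" results
above) can be modelled on plain strings. This is only a sketch of one
convention, assuming each pattern contains exactly one "*"; it is not what
any particular file system does. The names STAR-PORTION and STAR-TRANSLATE
are hypothetical.

```lisp
(defun star-portion (pattern string)
  "Return the part of STRING matched by the single \"*\" in PATTERN,
or NIL if STRING does not match PATTERN."
  (let* ((star (position #\* pattern))
         (prefix (subseq pattern 0 star))
         (suffix (subseq pattern (1+ star)))
         (end (- (length string) (length suffix))))
    ;; STRING must be long enough to hold PREFIX and SUFFIX without overlap.
    (when (and (>= end (length prefix))
               (string= prefix string :end2 (length prefix))
               (string= suffix string :start2 end))
      (subseq string (length prefix) end))))

(defun star-translate (source from-pattern to-pattern)
  "Replace the \"*\" of TO-PATTERN with the part of SOURCE that matched
the \"*\" of FROM-PATTERN."
  (let ((portion (star-portion from-pattern source))
        (star (position #\* to-pattern)))
    (unless portion
      (error "~S does not match ~S" source from-pattern))
    (concatenate 'string
                 (subseq to-pattern 0 star)
                 portion
                 (subseq to-pattern (1+ star)))))
```

Under this convention (STAR-TRANSLATE "gazonk" "gaz*" "h*") gives "honk",
and swapping the two patterns maps "honk" back to "gazonk", which is the
property the <reversible> argument asks for.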
Rationale:
1,2,3. These three functions provide a standardized interface to the
idiosyncratic wildcard functionality of each host file system.
1. WILD-PATHNAME-P makes it possible to detect wild pathnames reliably and
do something useful (give up, merge out the bothersome components, call
DIRECTORY for a list of matching pathnames, etc.)
2,3. TRANSLATE-PATHNAME is needed by many application programs that deal with
wildcard pathnames. PATHNAME-MATCH-P and TRANSLATE-PATHNAME are needed
by logical pathnames. The reversible feature is needed by logical
pathnames. The PATHNAME-LOGICAL proposal cannot be implemented without
these features.
4. Since these functions return a value connected with one file, there
is no meaningful way to extend them to work on wildcard pathnames. It
seems best to specify that they signal an error, rather than leaving
the consequences undefined.
5. The consequences are proposed to be implementation-defined because
current practice varies and no one wants to change.
Current Practice:
Presumably no implementation supports the proposal exactly as stated.
Symbolics Genera has had similar features under different names for many
years:
(SEND pathname :WILD-P) returns a value such as NIL, :NAME, :TYPE,
etc., indicating the first wild field.
(SEND pathname :NAME-WILD-P), (SEND pathname :DIRECTORY-WILD-P),
etc. test individual fields.
The :TRANSLATE-WILD-PATHNAME, :TRANSLATE-WILD-PATHNAME-REVERSIBLE, and
:PATHNAME-MATCH messages resemble TRANSLATE-PATHNAME and
PATHNAME-MATCH-P.
The Explorer also supports the messages :WILD-P (although it only
returns NIL or T), :NAME-WILD-P, etc., :TRANSLATE-WILD-PATHNAME, and
:PATHNAME-MATCH.
Points 4 and 5 are current practice as far as the authors are aware.
The Explorer permits DELETE-FILE on a wild pathname, meaning to delete
all files that match.
Cost to Implementors:
Many implementations probably have a substrate which is capable of this
or something similar already. In such cases, it's a relatively small
matter to add the proposed interface.
Even in cases where an implementation doesn't have ready code, it's clearly
better for the implementor to write that code once and for all than to ask
each user of wildcards to write it.
Since the detailed behavior is at the implementor's discretion, the cost
is unlikely to be large. Some file systems will do all the work and the
implementor need only provide an interface to the file system or to a
standard library routine. For other file systems the implementor has to
write the actual matching and translation algorithms.
Cost to Users:
None. This change is upward compatible.
Cost of Non-Adoption:
Wild pathnames would continue to be mistaken for ordinary pathnames in
many situations. User programs that deal with wildcard pathnames would
have to operate on implementation-dependent representations and hence
would not be easily portable.
The biggest cost is that the logical pathnames proposal would be stymied.
Performance Impact:
None.
Benefits:
A more complete set of wildcard pathname operations. Portable user
programs that deal with wildcard pathnames will be more consistent
and reliable. A portable system construction tool can be written
and the foundations are laid for a `logical pathname' facility
(proposed separately in PATHNAME-LOGICAL).
Aesthetics:
This change would make some portable code less kludgey.
Discussion:
There was some question about the name. The name PATHNAME-WILD-P
suggests a ``slot'' of a pathname (like PATHNAME-HOST),
while WILD-PATHNAME-P suggests a type (like INPUT-STREAM-P).
The committee was split on what to call it. Since it is more
like a type than a slot, the name WILD-PATHNAME-P was chosen.
It's been suggested that WILD-PATHNAME-P and PATHNAME-MATCH-P be allowed
to return a value other than T to represent "truth", which would
somehow encode some additional information.
∂19-Jun-89 0911 X3J13-mailer Issue: BIT-ARRAY-FUNCTIONS (version 6)
Date: Mon, 19 Jun 89 12:13 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: BIT-ARRAY-FUNCTIONS (version 6)
To: X3J13@sail.stanford.edu
Message-ID: <19890619161334.2.MOON@EUPHRATES.SCRC.Symbolics.COM>
This is a new issue. It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
If the discussion gets lengthy, indicating that it actually is
controversial, then we should just drop it, rather than letting it
take up too much time.
Issue: BIT-ARRAY-FUNCTIONS
References: The binary bit-array functions BIT-AND, BIT-IOR, BIT-XOR,
BIT-EQV, BIT-NAND, BIT-NOR, BIT-ANDC1, BIT-ANDC2, BIT-ORC1,
and BIT-ORC2 (CLtL p.294).
The unary bit-array function BIT-NOT (CLtL p.295).
The mapping functions EVERY, MAP, NOTANY, NOTEVERY, and SOME
(CLtL pp.249-250).
The functions COUNT and POSITION (CLtL p.257).
Related issues: none
Category: ADDITION
Edit history: Version 1, 9-May-89, by Moon
Version 2, 10-May-89, by Moon (add second proposal)
Version 3, 12-May-89, by Moon (small wording improvements)
Version 4, 13-May-89, by Moon (make more understandable)
Version 5, 23-May-89, by Moon (fix -p naming convention)
Version 6, 19-Jun-89, by Moon (small fixes based on comments
from Kim Barrett)
Problem description:
Logical operations on bit vectors have been found to be useful in such
programs as compiler flow analysis. They are easy to implement in
straight Common Lisp, but such an implementation is many times slower
than an optimized implementation on most machines. This is partly
because many machines have instructions to perform these operations or
inner kernels of them, and partly because Common Lisp is not a good
language for implementing this type of low-level bit-oriented operation.
Common Lisp provides some logical operations on bit arrays, but the
provided set is incomplete. Furthermore, the operations that are
provided are only defined for arrays of identical dimensions, making them
less useful for bit vectors that represent sets, where trailing zero
elements are often omitted. Some of the sequence functions are useful
for bit vectors, but users (correctly) fear that their implementation may
be optimized for general sequences, not for bit vectors.
CLtL does not specify whether BIT-AND and related functions respect the
fill-pointer; however, the description of fill-pointers on p.295 implies
that they should.
This issue contains two alternative proposals.
Proposal (BIT-ARRAY-FUNCTIONS:ADD):
1. Allow the binary bit-array functions referenced above to accept
arguments of identical rank but unequal dimensions. Nonexistent elements
of bit-array-1 or bit-array-2 are assumed to be zero. If the third
argument is T or a bit-array, result elements outside the bounds of the
array must be zero or an error should be signalled. If the third
argument is NIL or omitted, each dimension of the result array is equal
to either the corresponding dimension of bit-array-1 or the corresponding
dimension of bit-array-2. The larger of the two dimensions is used when
necessary to hold all the nonzero elements of the result, otherwise
either the larger or the smaller of the two dimensions is used.
2. Allow BIT-NOT with a bit array as the second argument to accept
arguments of identical rank but unequal dimensions. Result elements
outside the bounds of the array must be zero or an error should be
signalled.
3. Add the following functions:
BIT-SUBSETP bit-array-1 bit-array-2
Returns true if for every element of bit-array-1 that is 1, the
element with the same subscripts exists in bit-array-2 and is 1.
Bit-array-1 and bit-array-2 must have identical rank but need not
have identical dimensions.
BIT-DISJOINTP bit-array-1 bit-array-2
Returns true if for every element of one bit-array that is 1, the
element with the same subscripts either does not exist in the other
bit-array or is 0. Bit-array-1 and bit-array-2 must have identical
rank but need not have identical dimensions.
BIT-EQUAL bit-array-1 bit-array-2
Returns true if for every element of one bit-array that is 1, the
element with the same subscripts exists in the other bit-array and
is 1. Bit-array-1 and bit-array-2 must have identical rank but need
not have identical dimensions.
4. Specify that the binary bit-array functions referenced above, the
unary bit-array function referenced above, and the three bit-array
functions referenced in point 3 respect the fill-pointer of any argument
that is one-dimensional and has a fill-pointer.
5. Suggest in the language specification document that compilers should
optimize the following functions when the sequence argument is declared
to be a bit-vector, taking advantage of any relevant special machine
instructions.
COUNT
POSITION
6. Suggest in the language specification document that compilers should
optimize the following functions when there are two arguments, the second
argument is declared to be a bit-vector, and the predicate argument is
#'ZEROP, taking advantage of any relevant special machine instructions.
EVERY
NOTANY
NOTEVERY
SOME
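A straightforward, unoptimized sketch of the three predicates of point 3,
restricted to bit vectors (rank 1) and written in portable Common Lisp.
The names BV-SUBSETP, BV-DISJOINTP, BV-EQUAL, and BIT-OR-ZERO are
hypothetical stand-ins, not the proposed functions themselves. Because
LENGTH is used as the bound, a fill-pointer is respected as in point 4.

```lisp
(defun bit-or-zero (bit-vector index)
  "Return element INDEX of BIT-VECTOR, treating nonexistent elements as 0."
  (if (< index (length bit-vector))
      (bit bit-vector index)
      0))

(defun bv-subsetp (bv1 bv2)
  "True if every 1 element of BV1 exists in BV2 and is 1 there."
  (dotimes (i (length bv1) t)
    (when (and (= 1 (bit bv1 i))
               (zerop (bit-or-zero bv2 i)))
      (return nil))))

(defun bv-disjointp (bv1 bv2)
  "True if no subscript holds a 1 in both BV1 and BV2."
  (dotimes (i (min (length bv1) (length bv2)) t)
    (when (= 1 (bit bv1 i) (bit bv2 i))
      (return nil))))

(defun bv-equal (bv1 bv2)
  "True if BV1 and BV2 have 1 elements at exactly the same subscripts."
  (and (bv-subsetp bv1 bv2)
       (bv-subsetp bv2 bv1)))
```

For example, (BV-EQUAL #*1010 #*101) is true although the lengths differ,
whereas EQUAL on the same two vectors is false.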
Proposal (BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS):
Points 1, 2, 4, 5, and 6 are the same as BIT-ARRAY-FUNCTIONS:ADD.
Substitute for point 3:
3. Do not add the three new functions. Instead, generalize the mapping
functions referenced above (EVERY, MAP, NOTANY, NOTEVERY, and SOME) so
that they operate on "mappables" rather than just sequences. Define a
mappable to be an array or a list. Specify that the mappable arguments
to a mapping function, and the result in the case of MAP with a non-NIL
first argument, must all be of the same rank (the rank of a list is
considered to be 1). Mapping accesses array elements in row-major order.
Generalize the existing specification that a mapping function uses the
length of the shortest sequence, to say that a mapping function uses on
each axis the minimum of the dimensions on that axis of the mappable
arguments.
Additional point 7:
7. Suggest in the language specification document that compilers should
optimize the functions EVERY, NOTANY, NOTEVERY, and SOME when there are
two arguments, the second argument is declared to be a bit-array, and the
predicate argument is #'ZEROP, taking advantage of any relevant special
machine instructions. In addition compilers should optimize when the
second argument is a call with two arguments to one of the binary
bit-array functions referenced above, to avoid consing an intermediate
result.
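The "minimum of the dimensions on each axis" rule for mappables can be
sketched for the two-array case as follows. EVERY-ON-ARRAYS is a
hypothetical name; the sketch takes exactly two arrays of equal rank and
makes no attempt at the optimizations suggested in point 7.

```lisp
(defun every-on-arrays (predicate array-1 array-2)
  "Apply PREDICATE to corresponding elements of ARRAY-1 and ARRAY-2 in
row-major order, visiting only subscripts that are valid in both arrays.
Return true if PREDICATE never returns false."
  (assert (= (array-rank array-1) (array-rank array-2)))
  (let ((limits (mapcar #'min
                        (array-dimensions array-1)
                        (array-dimensions array-2))))
    (labels ((walk (remaining subscripts)
               (if (null remaining)
                   ;; All axes chosen: SUBSCRIPTS was built in reverse.
                   (let ((subs (reverse subscripts)))
                     (funcall predicate
                              (apply #'aref array-1 subs)
                              (apply #'aref array-2 subs)))
                   ;; The first remaining axis varies slowest (row-major).
                   (dotimes (i (first remaining) t)
                     (unless (walk (rest remaining) (cons i subscripts))
                       (return nil))))))
      (walk limits '()))))
```

With a 2x3 and a 3x2 array only the common 2x2 corner is compared, so
(EVERY-ON-ARRAYS #'= #2A((1 0 1) (0 1 1)) #2A((1 0) (0 1) (1 1))) is true.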
Examples:
The equivalents of UNION and INTERSECTION for sets represented
as bit vectors, with 1's in positions where set elements are
present, are BIT-IOR and BIT-AND respectively.
(COUNT 1 (THE BIT-VECTOR BV)) computes the cardinality of a bit
vector (called the population on some computers). This is
analogous to LOGCOUNT of an integer.
(POSITION BIT (THE BIT-VECTOR BV)) scans for a 1 or 0 bit, but
can likely be implemented using a word-at-a-time scan.
(EVERY #'ZEROP (THE BIT-VECTOR BV)) tests whether a bit vector
is entirely zero.
(BIT-SUBSETP bit-array-1 bit-array-2) is equivalent to
(EVERY #'ZEROP (BIT-ANDC2 bit-array-1 bit-array-2)) [under
the first proposal, only when the arguments are of rank 1.]
(BIT-DISJOINTP bit-array-1 bit-array-2) is equivalent to
(EVERY #'ZEROP (BIT-AND bit-array-1 bit-array-2)) [under
the first proposal, only when the arguments are of rank 1.]
This is analogous to NOT of LOGTEST of two integers.
(BIT-EQUAL bit-array-1 bit-array-2) is equivalent to
(EVERY #'ZEROP (BIT-XOR bit-array-1 bit-array-2)) [under
the first proposal, only when the arguments are of rank 1.]
BIT-EQUAL differs from EQUAL only when the arguments are of
unequal dimensions.
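The equivalences above can be tried with the existing CLtL functions when
both arguments are bit vectors of equal length. COMPARE-BIT-SETS is a
hypothetical helper that collects the results in a property list.

```lisp
(defun compare-bit-sets (a b)
  "Return a property list describing bit vectors A and B as sets.
A and B must have the same length, as CLtL requires for BIT-AND etc."
  (list :cardinality (count 1 (the bit-vector a))
        :first-member (position 1 (the bit-vector a))
        ;; A is a subset of B when A AND (NOT B) is entirely zero.
        :subsetp (every #'zerop (bit-andc2 a b))
        ;; A and B are disjoint when A AND B is entirely zero.
        :disjointp (every #'zerop (bit-and a b))
        ;; A and B are equal when A XOR B is entirely zero.
        :equalp (every #'zerop (bit-xor a b))))
```

For A = #*0110 and B = #*1110 the cardinality is 2, the first member is at
position 1, and A is a subset of B but neither disjoint from nor equal to it.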
Rationale:
1,2. Relaxing the requirement that bit arrays must have equal dimensions
was requested by users who had tried to use these operations on sets.
3. Three new functions are added by BIT-ARRAY-FUNCTIONS:ADD because EVERY
only works on vectors, since issue SEQUENCE-FUNCTIONS-EXCLUDE-ARRAYS was
rejected. BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS includes the minimal
portion of that proposal needed to avoid adding any new functions, while
omitting all the controversial parts.
4. Respecting the fill-pointer of vectors makes the BIT-xxx functions
more consistent with the rest of the language. They can be thought of as
sequence functions for bit-vectors (and sequence functions always respect
the fill-pointer) that have been generalized to work on multidimensional
bit-arrays as well.
5,6,7. The suggestion for compiler optimization is to give users the
confidence that they will get good results when using sequence and
mapping operations on bit vectors. Otherwise we would feel the need to
add additional bit-vector-specific functions to perform these operations
in a way that is optimized and specialized for bit-vectors. Recommending
optimization of a particular way of performing these operations avoids
the problem of each implementation choosing a different idiom to
optimize, resulting in performance problems when porting.
Current practice:
Symbolics Genera 7.2 has something like the first proposal, but only for
bit vectors, not generalized for bit arrays. Genera has some additional
functions (BIT-VECTOR-POSITION, BIT-VECTOR-CARDINALITY, and
BIT-VECTOR-ZERO-P) that aren't really necessary since they are equivalent
to POSITION, COUNT, or EVERY plus a type declaration. The proposal seems
to fit into the rest of Common Lisp better than Genera's current practice.
Symbolics Genera 7.2 does not respect the fill-pointer in BIT-AND.
Cost to Implementors:
Implementing these very efficiently may require some clever hand coding.
Of course the standard cannot mandate any particular level of efficiency
and a simple, low-cost implementation is permissible. Implementing the
compiler suggestions requires keeping track of type declarations in the
compiler, but most compilers already do that. The second proposal
requires slightly more compiler analysis than the first proposal.
A run-time type test and dispatch to code specialized for bit-arrays
could be used instead of compiler analysis, at a small efficiency cost.
Cost to Users:
None, unless some implementation currently violates point 4 and user
programs currently depend on that. It seems quite unlikely that anyone
would depend on BIT-xxx functions to access past the fill-pointer.
Cost of non-adoption:
Less featureful language. Some bit array manipulation will have to be
written in nonportable Lisp code or in C or assembly language.
Performance impact:
None on programs that don't use these features. Negligibly small on
the binary bit-array functions referenced above when array dimensions
are equal. Large improvement for programs that can take advantage of
these features when running in an implementation that optimizes them.
Benefits:
More featureful language.
Esthetics:
More featureful language.
Discussion:
This functionality was suggested on the Common Lisp mailing list
12-Jan-89. The detailed design has evolved from what was suggested and
is greatly simplified.
The loose specification of the result dimensions in points 1 and 2 is to
allow maximum implementation freedom. This is not essential to the
proposal and could be changed to require that the result have the same
dimension as the larger of the two arguments.
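A sketch of the stricter alternative just mentioned, in which the result
always has the dimension of the larger argument, shown for BIT-IOR on bit
vectors only. BV-IOR-EXTENDED is a hypothetical name and the loop is the
simple bit-at-a-time version, not an optimized one.

```lisp
(defun bv-ior-extended (bv1 bv2)
  "Inclusive OR of two bit vectors of possibly unequal length.
Nonexistent elements are taken to be zero; the result is a new bit
vector as long as the longer argument."
  (let ((result (make-array (max (length bv1) (length bv2))
                            :element-type 'bit
                            :initial-element 0)))
    (dotimes (i (length result) result)
      (setf (bit result i)
            (logior (if (< i (length bv1)) (bit bv1 i) 0)
                    (if (< i (length bv2)) (bit bv2 i) 0))))))
```

For instance, (BV-IOR-EXTENDED #*101 #*01001) returns #*11101.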
It has been suggested that points 6 and 7 should specify some other
predicates to optimize, such as #'PLUSP or (COMPLEMENT #'ZEROP). Moon
doesn't think this is important enough to be worth adding.
∂19-Jun-89 0847 X3J13-mailer Issue: DATA-IO (version 7)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 19 Jun 89 08:47:32 PDT
Received: from KENNETH-WILLIAMS.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via INTERNET with SMTP id 612929; 19 Jun 89 11:49:14 EDT
Date: Mon, 19 Jun 89 11:47 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: DATA-IO (version 7)
To: X3J13@sail.stanford.edu
Message-ID: <19890619154723.1.MOON@KENNETH-WILLIAMS.SCRC.Symbolics.COM>
This is a new issue. It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
Issue: DATA-IO
References: CLtL pp.360, 370, 382
Related issues: none
Category: ADDITION
Edit history: Version 1, 9-May-89, by Moon
Version 2, 10-May-89, by Moon
(clarify ambiguities, add PRINT-UNREADABLE-OBJECT)
Version 3, 18-May-89, by Moon (respond to KMP's comments)
Version 4, 21-May-89, by Moon (almost-final cleanup)
Version 5, 22-May-89, by Pitman (``never say never'')
Version 6, 23-May-89, by Moon (final cleanup)
Version 7, 18-Jun-89, by Moon (more fixes based on
discussion in the cleanup subcommittee)
Problem description:
Storing data in textual form in files, as Lisp expressions, is common
practice but has some pitfalls. Files can be unreadable if #<...> syntax
is written by the printer, or if the reader syntax or package varies
between writing and reading. Files of data intended to be carried from
one Lisp implementation to another can fail to read correctly if
implementation-dependent syntax extensions get used when not intended.
CLtL p.370 recommends that unreadable objects be printed with #<...>
syntax including implementation-dependent information. Now that users
can write their own PRINT-OBJECT methods, a way is needed for such
methods to print this syntax without any implementation-dependent coding.
Proposal (DATA-IO:ADD-SUPPORT):
1a. Add a new variable *PRINT-READABLY*. Add a corresponding keyword
argument :READABLY to WRITE. The default value of *PRINT-READABLY* is
NIL. If *PRINT-READABLY* is true, then printing any object produces a
printed representation that the reader will accept. If this is not
possible, the printer signals an error of type PRINT-NOT-READABLE rather
than using an unreadable syntax such as #<...>. The printed
representation produced when *PRINT-READABLY* is true might or might not
be the same as the printed representation produced when *PRINT-READABLY*
is false.
1b. All methods for PRINT-OBJECT must obey *PRINT-READABLY*. This
includes both user-defined methods and implementation-defined methods.
1c. If *PRINT-LENGTH* (or *PRINT-LEVEL*) and *PRINT-READABLY* are both
true, and a list longer (deeper) than *PRINT-LENGTH* (*PRINT-LEVEL*) is
printed, the printer might ignore *PRINT-LENGTH* (or *PRINT-LEVEL*) or it
might signal PRINT-NOT-READABLE, but it will not print an abbreviated
list.
1d. Printed representations produced when *PRINT-READABLY* is true and
*PRINT-ESCAPE* is false might or might not be readable.
1e. Setting *PRINT-READABLY* to true and *PRINT-ESCAPE* to false might or
might not prevent errors of type PRINT-NOT-READABLE from being signalled.
2. Add a new reader control variable, *READ-EVAL*, whose default value is
T. If *READ-EVAL* is NIL, the #. reader macro signals an error. If
*READ-EVAL* is false and *PRINT-READABLY* is true, any PRINT-OBJECT
method that would output a #. reader macro either outputs something
different or signals an error of type PRINT-NOT-READABLE.
3. Add a new macro:
WITH-STANDARD-IO-SYNTAX &body body [Macro]
Within the dynamic extent of <body>, all reader/printer control
variables, including any implementation-defined ones not specified by
Common Lisp, are bound to values that produce standard read/print
behavior. The values for Common Lisp specified variables are:
*PACKAGE*                      The USER package
*PRINT-ARRAY*                  T
*PRINT-BASE*                   10
*PRINT-CASE*                   :UPCASE
*PRINT-CIRCLE*                 NIL
*PRINT-ESCAPE*                 T
*PRINT-GENSYM*                 T
*PRINT-LENGTH*                 NIL
*PRINT-LEVEL*                  NIL
*PRINT-PRETTY*                 NIL
*PRINT-RADIX*                  NIL
*PRINT-READABLY*               T
*READ-BASE*                    10
*READ-DEFAULT-FLOAT-FORMAT*    SINGLE-FLOAT
*READ-EVAL*                    T
*READ-SUPPRESS*                NIL
*READTABLE*                    The standard readtable
The values returned by WITH-STANDARD-IO-SYNTAX are the values
of the last body form, or NIL if there are no body forms.
4. Add a new macro:
PRINT-UNREADABLE-OBJECT (object stream &key type identity) &body body [Macro]
Output a printed representation of <object> on <stream>, beginning with
"#<" and ending with ">". Everything output to <stream> by the <body>
forms is enclosed in the angle brackets. If :type is true, the body
output is preceded by a brief description of the object's type and a
space character. If :identity is true, the body output is followed by
a space character and a representation of the object's identity,
typically a storage address.
If *PRINT-READABLY* is true, PRINT-UNREADABLE-OBJECT signals an error
of type PRINT-NOT-READABLE without printing anything.
The <object>, <stream>, :type, and :identity arguments are all evaluated
normally. :type and :identity default to false. It is valid to omit
the <body> forms. If :type and :identity are both true and there are no
<body> forms, only one space character separates the type and the identity.
The value returned by PRINT-UNREADABLE-OBJECT is NIL.
5. Add a new condition type:
PRINT-NOT-READABLE [Type]
Errors which occur during output while *PRINT-READABLY* is true, as a
result of attempting to output a printed representation that cannot be
read back, should inherit from this type. This is a subtype of ERROR.
The init keyword :OBJECT is supported to initialize the slot containing
the object being printed, which can be accessed using
PRINT-NOT-READABLE-OBJECT.
Examples:
;; Example #1: Reliable Write-Read
(WITH-OPEN-FILE (FILE pathname :DIRECTION :OUTPUT)
(WITH-STANDARD-IO-SYNTAX
(PRINT DATA FILE)))
; ... Later, in another Lisp:
(WITH-OPEN-FILE (FILE pathname :DIRECTION :INPUT)
(WITH-STANDARD-IO-SYNTAX
(SETQ DATA (READ FILE))))
;; Example #2: Use of PRINT-UNREADABLE-OBJECT
;; Note that in this example, the precise form of the output
;; is really implementation-dependent.
(DEFMETHOD PRINT-OBJECT ((OBJ AIRPLANE) STREAM)
(PRINT-UNREADABLE-OBJECT (OBJ STREAM :TYPE T :IDENTITY T)
(PRINC (TAIL-NUMBER OBJ) STREAM)))
(PRINT MY-AIRPLANE)
#<Airplane NW0773 36000123135> ;in Implementation A
;or
#<FAA:AIRPLANE NW0773 17> ;in Implementation B
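A further sketch, not part of the original examples, of how a caller
might recover from PRINT-NOT-READABLE. It assumes an implementation that
provides the operators under the names proposed above. The AIRPLANE class
definition and the helper TRY-PRINT-READABLY are hypothetical and are
introduced only for illustration.

```lisp
;; Hypothetical class; Example #2 only relies on the TAIL-NUMBER reader.
(defclass airplane ()
  ((tail-number :initarg :tail-number :reader tail-number)))

(defmethod print-object ((obj airplane) stream)
  (print-unreadable-object (obj stream :type t :identity t)
    (princ (tail-number obj) stream)))

;; Hypothetical helper: return the readable printed representation of
;; OBJECT as a string, or NIL and the offending object if there is none.
(defun try-print-readably (object)
  (handler-case
      (let ((*print-readably* t))
        (prin1-to-string object))
    (print-not-readable (condition)
      (values nil (print-not-readable-object condition)))))

;; A list of integers has a readable representation.
(try-print-readably '(1 2))

;; The airplane does not: the first value is NIL and the second value
;; is the airplane itself.
(try-print-readably (make-instance 'airplane :tail-number "NW0773"))
```

The error is detected at the point of writing, which is the behavior that
point 1 of the proposal is meant to guarantee.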
Rationale:
1. *PRINT-READABLY* is important so that errors involving data with no
readable printed representation are detected when writing the file, not
later on when the file is read.
*PRINT-READABLY* is different from *PRINT-ESCAPE* because output printed
with escapes only has to be generally recognizable by humans, whereas
output printed readably has to be reliably recognizable by computers.
2. Binding *READ-EVAL* to NIL is useful when reading data that came from
an untrusted source, such as a network or a user-supplied data file, to
prevent the #. reader macro from being exploited as a "Trojan horse" to
cause arbitrary forms to be evaluated.
3. Providing the WITH-STANDARD-IO-SYNTAX macro to bind all the variables,
instead of using LET and explicit bindings of the existing variables,
ensures that nothing is overlooked and avoids problems with
implementation-defined reader/printer control variables.
If the user wishes to use a non-standard value for some variable, such as
*PACKAGE* or *READ-EVAL*, it can be bound by LET inside the body of
WITH-STANDARD-IO-SYNTAX. Similarly, if the user dislikes the somewhat
arbitrary choices of values for *PRINT-CIRCLE* and *PRINT-PRETTY*, they
can be bound to the preferred values inside the body.
4. PRINT-UNREADABLE-OBJECT allows user-written PRINT-OBJECT methods to
adhere to implementation-specific style without requiring users to write
implementation-dependent code.
5. Defining a specific condition type associated with *PRINT-READABLY*
makes it possible for programs to handle the condition and recognize
the offending object.
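Points 2 and 3 can be combined as in the following sketch, which assumes
the variables and macro exist under the proposed names. The function name
READ-UNTRUSTED is hypothetical. It takes the standard syntax from
WITH-STANDARD-IO-SYNTAX and then overrides the one variable whose
standard value is unsuitable for untrusted input.

```lisp
;; Hypothetical helper: read one expression from STRING with standard
;; syntax, but with the #. reader macro disabled.
(defun read-untrusted (string)
  (with-standard-io-syntax
    (let ((*read-eval* nil))          ; override inside the body, per point 3
      (read-from-string string))))

;; Plain data reads normally.
(read-untrusted "(1 2 3)")

;; An attempt to evaluate a form at read time signals an error instead;
;; here the error is caught and turned into a keyword.
(handler-case (read-untrusted "#.(+ 1 2)")
  (error () :rejected))
```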
Current practice:
Symbolics Genera has had these features for many years, except with
different names. For instance, WITH-STANDARD-IO-SYNTAX is named
WITH-STANDARD-IO-ENVIRONMENT and binds *PACKAGE* to a non-standard
package. The proposed new names are better than the Genera names.
Genera's WITH-STANDARD-IO-ENVIRONMENT also disables #., to prevent Trojan
horses, since #. could evaluate an arbitrary form. This is particularly
important for network protocols. WITH-STANDARD-IO-SYNTAX does not bind
*READ-EVAL* to NIL, because that would prevent using #. in the printer
for common datatypes, which is current practice in some implementations
for printing PATHNAMEs or RANDOM-STATEs.
In Genera, PRINT-UNREADABLE-OBJECT is called SYS:PRINTING-RANDOM-OBJECT
and takes slightly different arguments. In PCL, PRINT-UNREADABLE-OBJECT
is called PCL:PRINTING-RANDOM-THING.
Cost to Implementors:
Very small; these features are all easy to add. If #. is output by any
system-supplied print methods, implementors might want to invent a
different syntax, but that is not required by this proposal.
Cost to Users:
None if they don't use the feature. Otherwise just the cost of
supporting *PRINT-READABLY* or using PRINT-UNREADABLE-OBJECT in their
PRINT-OBJECT methods.
Cost of non-adoption:
There will be no reliable, standard way to write data into a file.
Performance impact:
Negligible. Entering WRITE may be slightly slower since there is
one more keyword argument to parse and one more special variable
to bind before calling PRINT-OBJECT.
Benefits:
Data can be written into files reliably without resorting to
implementation-specific programming.
Esthetics:
Mildly improved.
Discussion:
Pitman and Moon support this proposal.
∂19-Jun-89 0851 X3J13-mailer Issue: FLOAT-UNDERFLOW (version 3)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 19 Jun 89 08:51:26 PDT
Received: from KENNETH-WILLIAMS.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via INTERNET with SMTP id 612931; 19 Jun 89 11:53:09 EDT
Date: Mon, 19 Jun 89 11:51 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: FLOAT-UNDERFLOW (version 3)
To: X3J13@sail.stanford.edu
Message-ID: <19890619155127.2.MOON@KENNETH-WILLIAMS.SCRC.Symbolics.COM>
This is a new issue. It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
Issue: FLOAT-UNDERFLOW
References: CLtL p.231
Related issues: LEAST-POSITIVE-SINGLE-FLOAT-NORMALIZATION (not written up),
ERROR-CHECKING-IN-NUMBERS-CHAPTER
Category: ADDITION and CLARIFICATION
Edit history: Version 1, 9-May-89, by Moon (suggested in January, but
the writeup was late)
Version 2, 23-May-89, by Moon (final cleanup for post-CLtL
changes to Common Lisp)
Version 3, 18-Jun-89, by Moon (update based on discussion
within the cleanup subcommittee)
Problem description:
In implementations with denormalized floating point numbers (as in IEEE
floating point), which are closer to zero than any non-zero normalized
floating point numbers, should the LEAST-POSITIVE- and
LEAST-NEGATIVE-XXX-FLOAT constants be the normalized or denormalized
values? Which is preferred depends on the application. Note that in
IEEE floating point, denormalized results are normally only produced
as a result of underflow.
Also, there is no portable way to control what happens when a floating
point number underflows. Sometimes this is an error, sometimes not.
Indeed there is no mention at all of underflow or overflow in CLtL.
Pending issue ERROR-CHECKING-IN-NUMBERS-CHAPTER does not mention floating
point overflow or underflow. Draft ANSI Common Lisp specifies error
conditions named FLOATING-POINT-OVERFLOW and FLOATING-POINT-UNDERFLOW
but does not specify the circumstances in which they are signalled and
does not provide any way to suppress underflow checking.
Proposal (FLOAT-UNDERFLOW:ADD-CONTROLS):
1. Clarify that the existing LEAST-POSITIVE-XXX-FLOAT and
LEAST-NEGATIVE-XXX-FLOAT constants are literally as defined, and
therefore can be denormalized numbers in implementations that have
denormalized numbers.
2. Add the following constants, whose values are the normalized floating
point numbers closest in value to (but not equal to) zero. In
implementations that don't have denormalized numbers, the values of these
constants are the same as the values of the other constants.
LEAST-NEGATIVE-NORMALIZED-DOUBLE-FLOAT [Constant]
LEAST-NEGATIVE-NORMALIZED-LONG-FLOAT [Constant]
LEAST-NEGATIVE-NORMALIZED-SHORT-FLOAT [Constant]
LEAST-NEGATIVE-NORMALIZED-SINGLE-FLOAT [Constant]
LEAST-POSITIVE-NORMALIZED-DOUBLE-FLOAT [Constant]
LEAST-POSITIVE-NORMALIZED-LONG-FLOAT [Constant]
LEAST-POSITIVE-NORMALIZED-SHORT-FLOAT [Constant]
LEAST-POSITIVE-NORMALIZED-SINGLE-FLOAT [Constant]
3. Add the following macro:
WITHOUT-FLOATING-UNDERFLOW-TRAPS &body body [Macro]
Within the dynamic extent of the body, the result of a floating point
computation which would otherwise underflow is a denormalized number
(if they are supported in the implementation) or zero, whichever is
closest to the mathematical result.
The values of WITHOUT-FLOATING-UNDERFLOW-TRAPS are the values of the
last body form, or NIL if there are no body forms.
4. Clarify that outside the dynamic extent of
WITHOUT-FLOATING-UNDERFLOW-TRAPS, a floating point computation that
underflows should signal an error of type FLOATING-POINT-UNDERFLOW. A
result that can only be represented in denormalized form is considered an
underflow in implementations that support denormalized floating point
numbers.
5. Clarify that a floating point computation that overflows should signal
an error of type FLOATING-POINT-OVERFLOW.
Example: (not portable of course)
(expt 0.1 40) => FLOATING-POINT-UNDERFLOW error
(describe (without-floating-underflow-traps (expt 0.1 40))) =>
1.0e-40 is a single-precision floating-point number.
Sign 0, exponent 0, 23-bit fraction 213302 (denormalized)
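The constants of point 2 also give a portable way to ask whether a number
is denormalized. The following is a sketch that assumes the constants
exist under the proposed names. The predicate DENORMALIZED-SINGLE-P is
hypothetical and is not part of the proposal.

```lisp
;; Hypothetical predicate: true if X is a nonzero single-float whose
;; magnitude is below the least normalized single-float.
(defun denormalized-single-p (x)
  (and (typep x 'single-float)
       (/= x 0)
       (< (abs x) least-positive-normalized-single-float)))

;; By points 1 and 2, the literal least positive value can never exceed
;; the least positive normalized value; the two are equal in
;; implementations without denormalized numbers.
(<= least-positive-single-float
    least-positive-normalized-single-float)

;; An ordinary number and zero are not denormalized.
(denormalized-single-p 1.0f0)
(denormalized-single-p 0.0f0)
```

In an implementation without denormalized numbers the predicate is false
for every argument.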
Rationale:
The ANSI Common Lisp standard should be compatible with the widely used
IEEE Floating Point standard.
WITHOUT-FLOATING-UNDERFLOW-TRAPS is provided as a macro to allow
implementation flexibility. It could expand into HANDLER-BIND for
FLOATING-POINT-UNDERFLOW, but in most implementations it will probably
expand into implementation-dependent code that sets a hardware mode bit.
Specifying "should signal" rather than "signals" or "might signal" for
floating-point overflows and underflows seems the best balance between
safety and implementation freedom. It wouldn't harm the proposal to
change it to one of the other two phrases.
Current practice:
The proposal exactly matches Symbolics Genera release 7 except for
the names of the conditions.
Lucid Common Lisp 3.0 implements parts 1, 2, 4, and 5 of the proposal.
Instead of point 3 of the proposal, Lucid Common Lisp 3.0 has a macro
(WITH-FLOATING-POINT-TRAPS enable-condition-list disable-condition-list
&body body) that enables and disables a variety of floating-point-related
conditions, a function ENABLED-FLOATING-POINT-TRAPS that returns a list
of condition names, a constant SUPPORTED-FLOATING-POINT-CONDITIONS whose
value is a list of condition names, and several additional condition
names (the exact set of condition names varies, depending on the
hardware).
Cost to Implementors:
Adding the constants and the macro is easy. Since it was never clarified
that floating point underflow is to be detected in safe code, implementors
who had not already implemented that might have to go to some expense.
In the laissez-faire spirit of floating point in Common Lisp, we could
relax the specification and say only that underflow might signal rather
than should signal.
Cost to Users:
None.
Cost of non-adoption:
Each Common Lisp implementation that uses IEEE Floating Point will have
to invent its own way to deal with underflow and denormalized numbers.
Performance impact:
No effect on code optimized for speed rather than safety.
Benefits:
Increased portability and correctness of floating point code.
Esthetics:
Neutral.
Discussion:
Maybe point 3 of the proposal should be replaced by the more complex
feature from Lucid. This would allow re-enabling underflow checking
after it had been disabled, and would allow control over other traps such
as overflow and inexact result. Moon would prefer to keep it simple,
but if others support the more general mechanism, he can accept it.
If the group cannot agree on this, Moon suggests dropping point 3 from
the proposal and passing points 1, 2, 4, and 5.
∂19-Jun-89 0914 X3J13-mailer Issue: MAP-INTO (version 2)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 19 Jun 89 09:14:00 PDT
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 612957; 19 Jun 89 12:15:55 EDT
Date: Mon, 19 Jun 89 12:16 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Reply-To: CL-Cleanup@sail.stanford.edu
Subject: Issue: MAP-INTO (version 2)
To: X3J13@sail.stanford.edu
Message-ID: <19890619161633.3.MOON@EUPHRATES.SCRC.Symbolics.COM>
This is a new issue. It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
Issue: MAP-INTO
References: none
Related issues: BIT-ARRAY-FUNCTIONS
Category: ADDITION
Edit history: 23-May-89, version 1 by Moon
19-Jun-89, version 2 by Moon (fix arglist)
Problem description:
The function MAP is very useful but can be a source of inefficiency
because it conses the result. Sometimes the user has storage
already allocated in which the result could be stored.
Proposal (MAP-INTO:ADD-FUNCTION):
Add the following function:
MAP-INTO result-sequence function sequence &rest more-sequences [Function]
Destructively modifies the result-sequence to contain the results of
applying function to each element in the argument sequences in turn.
Returns result-sequence.
MAP-INTO differs from MAP in that it modifies an existing sequence
rather than creating a new one.
The arguments result-sequence and each element of sequences can each be
either a list or a vector (one-dimensional array). Note that nil is
considered to be a sequence, of length zero. If result-sequence and
each element of sequences are not all the same length, the iteration
terminates when the shortest sequence is exhausted.
If BIT-ARRAY-FUNCTIONS:NO-NEW-FUNCTIONS passes, then MAP-INTO will
allow result-sequence and each element of sequences to be mappables
all of the same rank.
The function must take at least as many arguments as there are
sequences provided, and at least one sequence must be provided.
If function has side effects, it can count on being called first on all
of the elements with index 0, then on all of those numbered 1, and so
on.
Examples:
(map-into x #'+ x y)
(map-into q #'cons keys vals)
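The two examples above leave their variables unspecified. A fuller
sketch with concrete data follows, assuming MAP-INTO behaves as proposed.
The function name ADD-INTO is hypothetical.

```lisp
;; Hypothetical wrapper around the first example above: add the elements
;; of Y into X in place and return X.
(defun add-into (x y)
  (map-into x #'+ x y))

;; Lists and vectors can be mixed. Y is longer than X, so its last
;; element is never used.
(add-into (list 1 2 3) (vector 10 20 30 40))      ; => (11 22 33)

;; The result sequence is longer than the arguments, so iteration stops
;; after three elements and the rest of the result is left untouched.
(map-into (make-array 5 :initial-element 0)
          #'* '(1 2 3) '(4 5 6))                  ; => #(4 10 18 0 0)
```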
Rationale:
MAP-INTO is a simple way to express reuse of storage that is
stylistically consistent with the rest of Common Lisp.
Current practice:
Symbolics Genera 7.2 implements the proposal.
Cost to Implementors:
Small.
Cost to Users:
None.
Cost of non-adoption:
Small.
Performance impact:
None.
Benefits:
More expressive language.
Esthetics:
User programs won't have to write the above examples as
(loop for xx on x and yy in y do
(setf (car xx) (+ (car xx) yy)))
or something else about equally horrible.
Discussion:
None.
∂23-May-89 1148 CL-Cleanup-mailer Issue: STRING-COERCION (version 2)
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 23 May 89 11:48:21 PDT
Received: from KENNETH-WILLIAMS.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via INTERNET with SMTP id 599438; 23 May 89 14:29:55 EDT
Date: Tue, 23 May 89 14:34 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: STRING-COERCION (version 2)
To: CL-Cleanup@sail.stanford.edu
Message-ID: <19890523183420.0.MOON@KENNETH-WILLIAMS.SCRC.Symbolics.COM>
This is a new issue. This issue seems sufficiently simple and
noncontroversial that I would like to see it on the agenda for the June
X3J13 meeting. Let's use the cleanup subcommittee to test the assertion
that this is a simple and noncontroversial issue. If it's
controversial, let's just drop it, otherwise let's give X3J13 a chance
to vote for or against it.
Issue: STRING-COERCION
References: Strings (pp299-304),
STRING= (p300), STRING-EQUAL (p301), STRING< (p301),
STRING> (p301), STRING<= (p301), STRING>= (p301),
STRING/= (p301), STRING-LESSP (p302), STRING-GREATERP (p302),
STRING-NOT-GREATERP (p302), STRING-NOT-LESSP (p302),
STRING-NOT-EQUAL (p302), STRING-TRIM (p302), STRING-LEFT-TRIM (p302),
STRING-RIGHT-TRIM (p302), STRING-UPCASE (p303), STRING-DOWNCASE (p303),
and STRING-CAPITALIZE (p303).
Related issues: none
Category: CLARIFICATION
Edit history: Version 1, 9-May-89 by Moon
Version 2, 9-May-89 by Pitman (editorial changes)
Problem description:
CLtL is inconsistent about the argument coercion performed by the
referenced functions. Page 299 says that the <string> argument can
be either a symbol or a string. Page 304 says that these functions
effectively call the STRING function, thus accepting a symbol,
a string, or a character.
Neither page lists the set of affected functions explicitly.
Page 304 says that if any other data type is used, an error is
signalled. But some implementations allow other types, such as
pathnames, to be coerced to strings, which page 299 appears to allow
but page 304 appears to forbid. In some implementations these
coercions are under user control via methods for a generic function.
Proposal (STRING-COERCION:MAKE-CONSISTENT):
Specify that the referenced functions perform coercion identical to
the action of the STRING function.
Specify that the STRING function can perform additional implementation
dependent coercions. In all cases the returned value is of type STRING.
Only in the case where no coercion is defined is the STRING function
required to signal an error; in that case, the error is of type TYPE-ERROR.
Examples:
(string-lessp #\a "B") => 0 ;true; the value is the mismatch index
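A few more cases, as a sketch of the behavior the proposal would
require. They assume an implementation in which the referenced functions
coerce their arguments exactly as STRING does, and the default reader
case conversion for the symbol FOO.

```lisp
;; STRING itself accepts a character, a symbol, or a string.
(string #\a)                ; => "a"
(string 'foo)               ; => "FOO"

;; Under the proposal the referenced functions accept the same arguments.
(string= 'foo "FOO")        ; => T
(string-upcase #\a)         ; => "A"
(string-equal #\a "A")      ; => T
```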
Rationale:
Our choices are to make the coercion identical to the STRING function,
identical to the COERCE function, or different from both of them. The
COERCE function won't coerce non-null symbols to strings, so it is out.
Being consistent with the STRING function seems better than inventing
yet another set of string coercion rules. Removing the ability for the
STRING function to coerce characters to strings would be an incompatible
change, so instead we clarify that the other functions have that ability.
Allowing additional coercions is harmless and consistent with current
practice.
Current practice:
Symbolics Genera follows page 304 except for allowing additional
coercions. Symbolics Cloe follows page 299 except for not allowing
additional coercions.
Cost to Implementors:
Small changes to eighteen functions.
Cost to Users:
None, this is upward-compatible.
Cost of non-adoption:
Inconsistency and confusion about what coercions are allowed.
Performance impact:
None. If these things have to accept symbols, accepting characters
too can't make much difference. The implementation of character
arguments to string functions might cons a string, but this has no
performance impact on programs that don't use the feature.
Benefits:
Consistency.
Esthetics:
Consistency.
Discussion:
None.
∂23-Mar-89 1527 X3J13-mailer issue DEFINE-OPTIMIZER, version 6
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 23 Mar 89 15:26:04 PST
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA13550; Thu, 23 Mar 89 14:18:32 -0700
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA12741; Thu, 23 Mar 89 14:18:28 -0700
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8903232118.AA12741@defun.utah.edu>
Date: Thu, 23 Mar 89 14:18:27 MST
Subject: issue DEFINE-OPTIMIZER, version 6
To: x3j13@sail.stanford.edu
There have been a number of small changes made to this writeup.
- The description of DEFINE-OPTIMIZER now says that the optimizer should
return only one value.
- The relationship to INLINE/NOTINLINE declarations has been clarified.
- There have been some minor clarifications to wording.
- The discussion section has been expanded.
Forum: Compiler
Issue: DEFINE-OPTIMIZER
References: Issue SYNTACTIC-ENVIRONMENT-ACCESS
Category: ADDITION
Edit history: 28-Sep-88, Version 1 by Pitman
10-Mar-89, Version 2 by Pitman (clarifications, new example),
10-Mar-89, Version 3 by Pitman & Loosemore
11-Mar-89, Version 4 by Pitman
13-Mar-89, Version 5 by Loosemore (discussion)
22-Mar-89, Version 6 by Loosemore (more discussion)
Status: Ready for release
Problem Description:
Often a general functional interface could be bypassed given explicit
knowledge of the arguments passed. This may happen when the arguments
are constant (or otherwise inferrable), an argument type is known (e.g.,
due to use of THE or DECLARE), or when some particular pattern of
optional, rest or keyword arguments is apparent.
Most implementations provide internally for optimization of generalized
function call interfaces to more specialized ones, but such an
optimization facility is not provided to Common Lisp users.
The absence of this facility in a portable fashion means that some
CL programs run slower than they need to in some implementations, or
else that some operators that should be implemented as functions end
up getting implemented as macros to assure needed efficiency.
Proposal (DEFINE-OPTIMIZER:NEW-FACILITY):
Introduce a facility for declaring compiler optimizations.
DEFINE-OPTIMIZER name arglist {declaration}* {form}* [Macro]
Defines a compiler optimizer for a function named NAME. The ARGLIST,
DECLARATIONS, and FORMS are treated exactly like the arglist,
declarations, and forms in a DEFMACRO. (The arglist may include
&ENVIRONMENT and &WHOLE.)
The argument NAME must name a function which has been previously
defined. The effects of defining an optimizer for a locally or
globally defined macro, a locally defined function, or a special
form are undefined.
When the optimizer is invoked, the forms are executed in the context
of bindings specified by the arglist as an implicit PROGN. The
optimizer should return a form which is preferable to evaluate instead
of the indicated call, or NIL to decline to optimize. If an optimizer
wishes to optimize into a form whose result is NIL, it should return
(QUOTE NIL). The resulting form should be careful to preserve the
semantics (including order-of-evaluation) of the original function
call.
If a call to DEFINE-OPTIMIZER appears at top-level in a file
being processed by the file compiler, it also makes the optimizer
known at compile-time (similar to the way DEFMACRO makes a macro
definition known to the compiler).
OPTIMIZE-EXPRESSION-1 form env [Function]
Similar to MACROEXPAND-1. Invokes the optimizers for the top level of
FORM, but does not iterate on the result. Returns two values:
RESULT and CHANGED-P.
Note: If an optimizer declines to optimize,
OPTIMIZE-EXPRESSION-1 hides the fact by returning FORM,NIL
rather than NIL,NIL.
OPTIMIZE-EXPRESSION form env [Function]
Iterates calling OPTIMIZE-EXPRESSION-1 until the CHANGED-P result
is NIL. Two values are returned: RESULT and CHANGED-P.
An implementation must save optimizer definitions created by
DEFINE-OPTIMIZER in case OPTIMIZE-EXPRESSION is attempted, but is
not actually required to call OPTIMIZE-EXPRESSION itself. Interpreters,
for example, may choose to just call the unoptimized form.
Special forms such as FLET and MACROLET that create local functional
definitions shadow not only functions and their SETF methods,
but also their optimizers. No portable facility is provided for creating
locally defined optimizers.
The effect of defining optimizations for functions in the LISP package
is not defined. (In some implementations, this would clobber or conflict
with existing advice that may be of higher quality.)
The editor is advised that a non-binding style note such as the
following would also be appropriate:
In general, it is poor style for a programmer to define optimizers for
functions that he does not maintain. This is because the correct
implementation of an optimizer for a function usually depends on an
understanding of the internals of that function. As such, a function
definition and any optimizers should be maintained as a unit so that
changes in either can be synchronized as appropriate with the
other.
If a function that has an optimizer function is declared INLINE,
the optimizer has precedence. If a function that has an optimizer
function is declared NOTINLINE, the application of the optimizer
function by OPTIMIZE-EXPRESSION and OPTIMIZE-EXPRESSION-1 is
inhibited.
Example:
;; These examples are taken literally from the Macsyma sources,
;; modified only to change DEFOPT to DEFINE-OPTIMIZER. The comments
;; were specially written for the X3J13 audience.
;; M+ adds a Macsyma expression to another Macsyma expression.
;; The Macsyma internal representation for the sum of X and Y is
;; ((MPLUS) X Y). All the real work is done by SIMPLIFY, which
;; reduces the expression as necessary. However, SIMPLIFY
;; is very complicated, and considerable speed can be gained by
;; entering it at specific known places.
(DEFUN M+ (&REST TERMS)
(PROTECT-&REST-VARIABLE TERMS)
(SIMPLIFY `((MPLUS) ,@TERMS)))
(DEFINE-OPTIMIZER M+ (&REST TERMS)
(COND ((= (LENGTH TERMS) 2) `(ADD2* ,@TERMS))
(T `(ADDN (LIST ,@TERMS) NIL))))
;; M- negates a Macsyma expression, or subtracts two Macsyma
;; expressions. Once you figure out which of the two operations is
;; to be done, the problem is similar to that of M+ above. However,
;; often the decision can be made at compile time. In this case,
;; INLINE functions would have worked ok, except that not all
;; implementations do inlining, and even those that do may fail to
;; recognize that EXP2 being NIL means that a test can be eliminated
;; or dead code can be eliminated. Using optimizers is far more
;; likely to be useful in practice.
(DEFUN M- (EXP1 &OPTIONAL (EXP2 NIL EXP2P))
(IF (NOT EXP2P)
(M--INTERNAL-NEGATE EXP1)
(M--INTERNAL-SUBTRACT EXP1 EXP2)))
(DEFINE-OPTIMIZER M- (EXP1 &OPTIONAL (EXP2 NIL EXP2P))
(IF (NOT EXP2P)
`(M--INTERNAL-NEGATE ,EXP1)
`(M--INTERNAL-SUBTRACT ,EXP1 ,EXP2)))
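The way the proposed operators fit together can be modeled in portable
code. The following is only a toy sketch, not any implementation's
actual mechanism. The names *TOY-OPTIMIZERS*, TOY-DEFINE-OPTIMIZER,
TOY-OPTIMIZE-EXPRESSION-1, and TOY-OPTIMIZE-EXPRESSION are hypothetical.
The sketch ignores the environment argument, does not support
&ENVIRONMENT, and does not consult INLINE or NOTINLINE declarations. It
also assumes, by analogy with MACROEXPAND, that the second value of the
iterating function is true if any step changed the form; the proposal
does not spell this out.

```lisp
;; Maps a function name to a function from a call form to either a
;; replacement form or NIL (decline to optimize).
(defvar *toy-optimizers* (make-hash-table :test #'eq))

(defmacro toy-define-optimizer (name arglist &body body)
  ;; Destructure the arguments of the call form much as DEFMACRO would.
  (let ((form (gensym "FORM")))
    `(progn
       (setf (gethash ',name *toy-optimizers*)
             (lambda (,form)
               (destructuring-bind ,arglist (rest ,form)
                 ,@body)))
       ',name)))

(defun toy-optimize-expression-1 (form)
  ;; One step only. A declining optimizer is hidden from the caller:
  ;; the values are FORM and NIL rather than NIL and NIL.
  (let* ((optimizer (and (consp form)
                         (symbolp (first form))
                         (gethash (first form) *toy-optimizers*)))
         (result (and optimizer (funcall optimizer form))))
    (if result
        (values result t)
        (values form nil))))

(defun toy-optimize-expression (form)
  ;; Repeat single steps until one of them reports no change.
  (let ((changed-any nil))
    (loop
      (multiple-value-bind (result changed-p)
          (toy-optimize-expression-1 form)
        (unless changed-p
          (return (values result changed-any)))
        (setq form result
              changed-any t)))))

;; The M- optimizer from the example above, in the toy facility.
(toy-define-optimizer m- (exp1 &optional (exp2 nil exp2p))
  (if (not exp2p)
      `(m--internal-negate ,exp1)
      `(m--internal-subtract ,exp1 ,exp2)))

(toy-optimize-expression-1 '(m- a))    ; => (M--INTERNAL-NEGATE A), T
(toy-optimize-expression-1 '(car x))   ; => (CAR X), NIL
```

Because NIL means "decline", an optimizer in this model that wants to
produce a form whose value is NIL must return (QUOTE NIL), exactly as
the proposal requires.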
Rationale:
Many large portable applications expect such a facility on an
implementation-specific basis. Others would use one if available.
Even if implementations don't use the provided optimizers primitively,
user macros and code-walkers can invoke them, so the facility wouldn't
be completely useless even in those implementations.
The rationale for giving optimizers precedence over INLINE declarations
is that the optimizer can look for special patterns in the arguments,
and defer to the inline if it doesn't find them.
Current Practice:
Symbolics Genera provides an optimizer facility which is more elaborate
but not fundamentally incompatible with this facility.
Many (if not most) serious implementations provide a similar facility.
For example, Lucid provides "compiler macros" which serve the same
purpose.
Cost to Implementors:
Since the implementation is not required to use this facility, the
cost of providing the proposed support is very small.
Cost to Users:
None. This change is upward compatible.
Cost of Non-Adoption:
Portable code would be slower than necessary in some situations.
Benefits:
Some existing non-portable code could become portable.
Aesthetics:
Keeping the optimizer definition separate from the main function definition
creates the possibility that the optimizer and main function could drift out
of synch. However, most places where this gets used in the first place
are places where speed is of paramount importance and the programmer is
willing to invest effort in maintaining things correctly and to accept the
risk of lossage if s/he fails.
This is a fairly clean and simple extension which adds significant
power to the compiler.
Discussion:
Pitman strongly supports this proposal, the design of which is modeled
directly after that which has been used in Macsyma for many years.
Information about argument type can come from two different sources:
THE and declarations (via PROCLAIM or DECLARE). The former information
is portably accessible, the latter is not. While a separate proposal
(SYNTACTIC-ENVIRONMENT-ACCESS) for allowing program access to type
declarations would make this facility more useful, it is still
quite useful without it, as the examples from Macsyma illustrate.
Some implementations provide a way to provide more than one optimizer
for the same function. A multiple optimizer facility can be written
in terms of this simpler facility and vice versa, so the simpler of
the two facilities is proposed here.
Some people have suggested that they would like to see a pattern
matching facility integrated into this facility. The design of a
facility that would satisfy everyone would take a lot of time and
effort. At this point, there is no chance that the design of such a
facility would occur in time for acceptance into the standard.
The choice is this or nothing. Pitman thinks the language is much
better off with some form of optimization support than none.
David Moon says:
I'm not a fan of documentation strings, but shouldn't DEFINE-OPTIMIZER
allow them? Was their omission accidental or intentional?
It was probably accidental, but it's hard to say what should be done
with them if they are supported. Presumably the function already
would have its own FUNCTION documentation. Should the DOCUMENTATION
function be extended to recognize OPTIMIZER as a doc-type symbol?
Loosemore says:
Although I don't really think this is an essential feature to include
in the standard, I don't have any strong objection to adding it. If
people think it's a good idea to provide a standard interface for this
kind of thing, this is a good proposal for doing it -- it's fairly
simple, doesn't introduce any radically new ideas, and is general
enough to allow alternate interfaces (such as the pattern matcher) to
be layered on top of it.
From Barry Margolin:
While I like the proposal in general, I don't think it's appropriate to
add this to the language at this time. If most Lisp vendors are in
favor of it, though, my objection is pretty weak. But there's still the
editorial issue of adding it to the standard. I don't really think it's
worth it for the first version of the standard.
Also, I don't see a whole lot of value in portable optimizers. Yes, the
Macsyma example is a good one, but the real value of optimizers comes
when they translate into calls to extra fast, internal functions.
Portable optimizers can't do this, and non-portable optimizers don't
need to be defined using a portable mechanism.
From Kent Pitman:
I believe this claim is unsubstantiated and unsubstantiable. In many
implementations, internal functions have no special property that user
programs do not. In some cases, that makes the optimizers that much
more important since most internal functions run a constant factor
faster, but do not have any algorithmic leverage over user programs.
Optimizers are potentially able to do much better than built-in
optimizations because they can use domain-specific information that
is beyond the power of even the proverbial SCC (Sufficiently Clever
Compiler).
Optimizers have been around for a -long- time. They are not new
technology. If we cannot adopt at least this much this time, I see
no reason why for CL 2000 we won't have the exact same arguments
raised and we -still- won't get anything. On the other hand, if we
adopt them now we get years of field testing, and next time there
will be a lot of users with suggestions about how to improve them.
Some progress must be made incrementally -- but no progress is made
if the increment is zero.
The risks are very low. This proposal already says the optimizer
function has to be semantics-preserving, and that it might never be
called. It's hard to see how that can go wrong.
For so little cost and so much potential gain, I think it is worth any
associated risk.
From Robert Krajewski:
I think a portable optimizer definer is a fine idea. It's especially
useful for authors of Common Lisp-embedded subsystems that offer safe
access to their data structures in a development environment, but who
also wish to produce fast code for delivery. In such cases, an
optimizer should only run when unsafe code is desired.
From Richard Gabriel:
I oppose this proposal. First, optimization is rarely something that
can be done portably. Using a name like define-optimizer gives the
impression that something will be done more optimally, and maybe it
won't.
Second, it appears that this functionality is isomorphic to macros,
except possibly macros that are only in effect during compilation.
Third, it seems to solve a problem that is addressed by all the various
abstraction mechanisms around already.
Fourth, it is part of a trend I will call ``featherbedding'', which I
will use in my messages to refer to adding comfortable features to
Common Lisp that are redundant or not strictly necessary.
From Dick Waters:
I would like to say that I think that compiler optimizers are an
extremely good idea---right up there with the best of the ideas ever
presented to the committee.
I really hate having to make things macros for trivial reasons,
because this blocks you from funcalling them and using them as
arguments to MAPCAR REDUCE etc. If this mechanism were in Common Lisp
I would use it all the time. I bet it would cover a significant chunk
of what I use macros for.
To be more specific, there are a number of places where such compiler
optimizers would be of HUGE benefit in my portable implementation of
SERIES. In particular, they would be an appropriate framework in
which to state the whole thing. Now, since I have to do it all with
macros, a number of things that should, by every right, be functions
have to be macros instead. This in fact makes it impossible to make an
implementation of what I really want.
-------
∂12-Dec-88 1434 X3J13-mailer Issue: PROCLAIM-LEXICAL (Version 9)
Received: from Xerox.COM by SAIL.Stanford.EDU with TCP; 12 Dec 88 14:27:29 PST
Received: from Semillon.ms by ArpaGateway.ms ; 12 DEC 88 14:10:50 PST
Date: 12 Dec 88 14:08 PST
Sender: masinter.pa@Xerox.COM
To: x3j13@sail.stanford.edu
Subject: Issue: PROCLAIM-LEXICAL (Version 9)
From: cl-cleanup@sail.stanford.edu
reply-to: cl-cleanup@sail.stanford.edu
cc: masinter.pa@Xerox.COM
line-fold: no
Message-ID: <881212-141050-5132@Xerox>
!
Issue: PROCLAIM-LEXICAL
References: variables (p55), scope/extent (p37), global variables (p68),
declaration specifiers (p157)
Category: CLARIFICATION/ADDITION
Edit history: Version 2 by Rees 28-Apr-87
Version 3 by Moon 16-May-87
Version 4 by Masinter 27-Oct-87
Version 5 by Masinter 14-Nov-87
Version 6 by Pitman 15-Sep-88
(major revision, for review by Jonathan Rees and Jeff Dalton)
Version 7 by Pitman 24-Sep-88
(minor revisions based on comments from Rees and Dalton)
Version 8 by Pitman 06-Oct-88 (merge recent discussion)
Version 9 by Masinter 8-Dec-88 (make JonL's changes)
Problem Description:
Although local variables in Common Lisp may be `special' or `lexical,'
global variables (with the exception of named constants) may currently
only be `special.'
The Scheme language permits free variable references to refer to global
bindings. Their experience suggests that such usage would be useful to
the Common Lisp community. The absence of such a facility in Common Lisp
is a barrier both culturally (to the sharing of ideas) and technically
(to the sharing of code).
SPECIAL proclamations are uncontrollably pervasive. There is no way
to locally override or globally undo a SPECIAL proclamation.
Background/Analysis:
Variable evaluation may be viewed in Common Lisp as a search through
a set of environments to find a binding, and then the dereferencing of
that binding. The environments with which Common Lisp deals are
Lexical (L), Dynamic (D), and Global (G).
A SPECIAL declaration for a variable amounts to a request that the
variable be resolved by searching first the Dynamic and then the Global
environment (DG).
As currently described in CLtL, lexical variable reference searches
only the Lexical environment (L).
Because undeclared free variables in the interpreter are implicitly
declared SPECIAL by most (perhaps all) implementations, this amounts
to a search of Lexical, Dynamic, and Global (LDG). However, the
accompanying warnings in many implementations make it clear that this
behavior is not intended to be taken seriously.
Constants are looked up solely in the Global environment (G). They
have other properties as well, of course.
In the Scheme language, the default lookup is first Lexical, then
Global (LG). Providing compatibility for Scheme code, and more
generally for a Scheme working style, is therefore difficult because
Common Lisp does not provide the LG search style.
The issue of whether a variable can be assigned is orthogonal.
The issue of whether a variable can be bound and, if it can be, which
environment is used for the new binding is orthogonal.
Proposal (PROCLAIM-LEXICAL:LG):
Provide a new declaration (and proclamation) called LEXICAL which implies
LG lookup. That is, variables declared LEXICAL would be looked up first
in the lexical environment (L) and then in the global environment (G)
if not found in the lexical.
Clarify that a dynamic binding of a variable creates a new binding
in the dynamic environment (D) leaving the global environment (G)
unaffected.
Clarify that special variable access does DG lookup. That is,
variables declared SPECIAL would be looked up first in the dynamic
environment (D) and then in the global environment (G) if not found
in the dynamic one. Further clarify that SYMBOL-VALUE does DG lookup.
Define that a lexical binding of a variable creates a new binding
in the lexical environment (L), leaving the global environment (G)
and the dynamic environment (D) unaffected.
Note that an assignment to a variable which is bound in the global
environment (G) will affect lexical (LG) lookups for which there is
no lexical (L) binding and dynamic (DG) lookups for which there is
no dynamic (D) binding.
Note that these restrictions describe an abstract model, not a
concrete implementation. An implementation may still choose to
implement dynamic binding as either deep or shallow, but some
searching may be necessary to find the global cell in shallow bound
implementations [unless dynamic binding has been forbidden for
that variable].
Like SPECIAL declarations (and unlike type declarations),
compilers and interpreters would be required to notice and
respect LEXICAL declarations.
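The abstract lookup model above can be sketched as a toy function. This is an illustration only, not an implementation strategy: TOY-LOOKUP and its alist representation of the three environments are hypothetical and are not part of the proposal.

```lisp
;; Toy model of the proposed lookup rules. Each environment is an alist
;; with the innermost binding first. MODE is :LG for a variable whose
;; reference style is LEXICAL and :DG for one whose style is SPECIAL.
(defun toy-lookup (name mode &key lexical dynamic global)
  (let* ((first-env (ecase mode
                      (:lg lexical)     ; LEXICAL: search L, then G
                      (:dg dynamic)))   ; SPECIAL: search D, then G
         (hit (or (assoc name first-env)
                  (assoc name global))))
    (if hit
        (cdr hit)
        (error "Unbound variable ~S under ~S lookup." name mode))))

;; A SPECIAL reference to X ignores the lexical binding X=3 and, with no
;; dynamic binding of X, falls through to the global X=1.
(toy-lookup 'x :dg
            :lexical '((x . 3))
            :dynamic '((y . 4))
            :global  '((x . 1) (y . 2)))
```

The same call with :LG finds the lexical X=3 first, and a LEXICAL reference to Y skips the dynamic Y=4 entirely.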
Examples:
#1: (proclaim '(lexical x))
(setq x 1)
(defun f (fn) (list x (funcall fn)))
(defun g (fn)
(let ((x 2))
(declare (special x))
(funcall fn #'(lambda () x))))
(g #'f) => (1 2)
#2: ; Warning: It is unlikely that any serious program would
; be written in so obscure a manner as this example.
; This just tests the fringe cases.
(proclaim '(lexical x))
(proclaim '(special y))
(setq x 1 y 2)
(defun tst ()
(let ((x 3) (y 4))
(locally (declare (special x) (lexical y))
(list x y
(funcall (let ((x 5) (y 6))
#'(lambda () (list x y))))))))
(tst) => (1 2 (5 4))
If the results of this example confuse you, keep in mind
that the results of code like this would be somewhat
confusing no matter what the chosen semantics because the
code itself is far from perspicuous.
An explanation of this behavior, which some people find less
than intuitive because of the bizarre choice of constructs:
X gets bound lexically to 3 because X is [pervasively]
proclaimed LEXICAL.
Y gets bound specially to 4 because Y is [pervasively]
proclaimed SPECIAL.
Reference style for name X is changed to SPECIAL, making
lexical X=3 invisible.
Reference style for name Y is changed to LEXICAL, making
dynamic Y=4 invisible.
Global X=1 and global Y=2 are first two elements of list.
X gets bound lexically to 5 because X is [pervasively]
proclaimed LEXICAL.
Y gets bound specially to 6 because Y is [pervasively]
proclaimed SPECIAL.
Closure is returned, capturing [lexical] X=5 but not
[special] Y=6.
Dynamic binding of Y to 6 disappears, dynamic binding
of Y to 4 reverts.
Closure is funcalled, returning captured X=5 and dynamically
active Y=4 in a list which becomes third list element.
Rationale:
This mechanism provides a simple and straightforward answer to
the problems stated above.
Current Practice:
Probably no one implements this.
Cost to Implementors:
A fair amount of compiler work would probably be needed. Some compilers
may have hooks for most of this already lying around, but some may not.
Note well that this proposal does not require separate global lexical
and dynamic cells, so the data storage layout of Lisp need not change.
Moon says...
I have now thought of an efficient way to do this on Lisp machines,
using invisible pointers, and another efficient way to do it on
stock hardware, using one extra instruction on every global
reference of one or the other sort, plus a few extra instructions
in SPECIAL binding and unbinding. Given that, I no longer object
to the proposal as unimplementable.
It doesn't just require a few compiler changes, it requires some
reimplementation of the representation of global variables, with
concomitant changes to the compiler, the loader, the interpreter,
and probably the debugger. Every symbol now potentially has two
values accessible from the interpreter (the current SPECIAL and
the global LEXICAL) and you need the corresponding new data
structure to keep track of that.
Rees suggests...
In shallow-bound implementations, implementors may have to add a
small run-time routine that searches the dynamic saved-binding
stack to look for the global value in the case where the variable
has been dynamically bound. One might want a bit (or a count)
somewhere (perhaps in the symbol itself) to speed up the common
case of access to a global binding of a variable that hasn't been
dynamically bound; without some kind of optimization, you have to
search the whole saved-binding stack on every reference to a
free [lexical] variable.
While naively you might think you'd incur the cost of clearing the
valid bit on every dynamic binding (not acceptable), in actuality
the bit is a static property of programs (PROGV excepted). So the
only places you ever need to clear FOO's valid bit are in PROGV,
in the interpreter, and when FASLOADing code that contains a compiled
dynamic binding of FOO.
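The search Rees describes might look roughly like the following toy model of a shallow-bound implementation. All names here are hypothetical, hash tables stand in for symbol value cells, and a per-symbol count plays the role of the "bit (or a count)" fast path.

```lisp
;; Toy shallow binding: the value cell always holds the current dynamic
;; value, and each binding saves the previous value on a stack.
(defvar *toy-value-cells* (make-hash-table :test #'eq))
(defvar *toy-binding-stack* '())          ; newest entry first
(defvar *toy-bind-counts* (make-hash-table :test #'eq))

(defun toy-bind (sym new-value)
  ;; Save the old value, count the binding, install the new value.
  (push (cons sym (gethash sym *toy-value-cells*)) *toy-binding-stack*)
  (incf (gethash sym *toy-bind-counts* 0))
  (setf (gethash sym *toy-value-cells*) new-value))

(defun toy-unbind ()
  ;; Undo the most recent binding.
  (let ((entry (pop *toy-binding-stack*)))
    (decf (gethash (car entry) *toy-bind-counts*))
    (setf (gethash (car entry) *toy-value-cells*) (cdr entry))))

(defun toy-global-value (sym)
  (if (zerop (gethash sym *toy-bind-counts* 0))
      ;; Fast path: never dynamically bound, so the cell is the global.
      (gethash sym *toy-value-cells*)
      ;; Slow path: the oldest saved value for SYM is the global value.
      (cdr (find sym *toy-binding-stack* :key #'car :from-end t))))
```

After (toy-bind 'foo 2) and (toy-bind 'foo 3) on a FOO whose global value is 1, the value cell holds 3 while TOY-GLOBAL-VALUE still finds 1.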
Cost to Users:
For the most part, this change is upward compatible.
Some code-walking tools would have to change.
Cost of Non-Adoption:
It would continue to be difficult to share code with Scheme.
New CL users coming from the Scheme community would be confused by
their occasional inability to map what they know about variable binding
into the CL model of variable binding.
Some interesting native CL applications would be impossible to write
in a syntactically convenient style.
Benefits:
Enhanced flexibility of expression.
Rationalization of the semantics of dynamic variables.
Aesthetics:
Improved appeal to a certain sector of the programming community.
Discussion:
Rees points out that it is an oversimplification to describe Scheme's
binding simply as LG since they have no Dynamic environment and
there is no way to distinguish LG and LDG. However, the reasons he
prefers LG are:
1. It's nice for readability and understandability to have a
declaration which tells you that a variable will not be
dynamically bound.
2. It's nice for performance in deep-bound implementations to have a
declaration that says that no search will be needed.
Of course, he notes, there could be a counter-argument to item 2
(in favor of LDG) in order to prefer shallow bound implementations,
but that still would not defeat the argument in item 1. Rees believes
that LG is slightly preferable, but that LDG would be essentially
adequate for most of his needs.
Pitman supports PROCLAIM-LEXICAL:LG and believes that giving LDG the
name LEXICAL would be a serious mistake, leaving open the door for
program bugs due to accidental binding of variables presumed by the
programmer not to be bound. If someone (Moon?) seriously wanted LDG
type variables in addition to LG variables (under a name other than
LEXICAL), Pitman would not object.
Dalton expressed support for PROCLAIM-LEXICAL:LG (Version 6).
He observes that another reason for opposing LDG is that it suggests
the possibility that someone might want DLG. LG is simpler and still
accomplishes the stated purpose. He adds ``I would like to be able
to explain the global environment as a sort of giant, extensible
LET around everything. This proposal seems to get fairly close.''
It would be possible to submit a proposal for a GLOBAL (G) declaration
under separate cover if anyone (Xerox?) was interested. Pitman thinks
this would be an interesting idea. Dalton points out, however, that
already with this proposal there is enough power to at least deal with
globals -- albeit circuitously. For example, to reference a global
variable X, one could write subroutines such as:
(defun global-x () (declare (lexical x)) x)
(defun set-global-x (value) (declare (lexical x)) (setq x value))
E.g., consider:
(defun f (x) (+ (global-x) x))
In principle, we could imagine saying that free variables should be
lexical by default, but that would only reduce error checking to no
good end. To be really useful, this proposal will need to be followed
by a proposal for primitives analogous to DEFVAR and/or DEFPARAMETER
but for lexical variables. However, since arguments over syntax are
likely to have plenty of issues of their own, we've separated this
proposal for primitive functionality from issues of syntax which
can be dealt with separately once this is passed.
Moon expressed concerns about the efficiency issues but after
thinking about it for a while convinced himself that this is
efficiently implementable both on stock and special purpose hardware.
JonL expressed concerns about the last-minute nature of this change,
which he sees as untested. This concern applies to the mixin of
the dynamic environment implicit in the LDG proposal.
Dalton suggests that an alternative solution to the speed issue
might be possible to obtain by restricting a particular variable to
be either LEXICAL or SPECIAL but not both.
Dalton points out that even if people don't like the details here, there
must be a better fallback solution than "do nothing". Pitman agrees
heartily.
∂21-Jun-89 1507 CL-Compiler-mailer Re: Issue: COMPILER-DIAGNOSTICS
Received: from STONY-BROOK.SCRC.Symbolics.COM by SAIL.Stanford.EDU with TCP; 21 Jun 89 15:06:58 PDT
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 614583; 21 Jun 89 14:44:27 EDT
Date: Wed, 21 Jun 89 14:44 EDT
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Re: Issue: COMPILER-DIAGNOSTICS
To: Sandra J Loosemore <sandra%defun@cs.utah.edu>
cc: CL-Compiler@sail.stanford.edu, Moon@STONY-BROOK.SCRC.Symbolics.COM
In-Reply-To: <8906192134.AA26815@defun.utah.edu>
Message-ID: <19890621184458.6.MOON@EUPHRATES.SCRC.Symbolics.COM>
Date: Mon, 19 Jun 89 15:34:55 MDT
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
If you have something specific to propose, I won't object to reopening
this issue. At the moment I'm feeling too lazy to do anything on it
myself, though.
Okay, a proposed modified writeup is in this message. I hope I started
from the correct version. I'll let you decide whether you want to put
this on the agenda.
Forum: Compiler
Issue: COMPILER-DIAGNOSTICS
References: CLtL p. 438-439, 62, 69, 160, 161
Condition System, Revision #18
S:>KMP>cl-conditions.text.34
Issue GC-MESSAGES
Issue RETURN-VALUES-UNSPECIFIED
Issue COMPILER-VERBOSITY
Issue CONDITION-RESTARTS
Category: CLARIFICATION, ENHANCEMENT
Edit History: V1, 15 Oct 1988, Sandra Loosemore
V2, 19 Oct 1988, Sandra Loosemore (minor fixes)
V3, 25 Oct 1988, Sandra Loosemore (input from Pitman & Gray)
V4, 01 Nov 1988, Sandra Loosemore (fix typos)
V5, 15 Dec 1988, Dan L. Pierson (new condition types)
V6, 15 Dec 1988, Sandra Loosemore (additions, fix wording)
V7, 16 Dec 1988, Dan L. Pierson (minor cleanup)
V8, 07 Jan 1989, Sandra Loosemore (expand discussion)
V9, 26 Jan 1989, Sandra Loosemore (simplify)
V10, 22 Mar 1989, Sandra Loosemore (error terminology)
V11, 11 Apr 1989, Kent Pitman (changes per X3J13)
V12, 21-Jun-89, Moon (changes to point 4 only: return a
status value from COMPILE also, make the status
value provide more detail)
Status: Passed V11
Problem Description:
It is unclear whether various diagnostics issued by the compiler are
supposed to be true errors and warnings, or merely messages.
In some implementations, COMPILE-FILE handles even serious error
situations (such as syntax errors) by printing a message and then
trying to recover and continue compiling the rest of the file, rather
than by signalling an error. While this user interface style is just
as acceptable as invoking the debugger, it means that a normal return
from COMPILE-FILE does not necessarily imply that the file was
successfully compiled.
Many compilers issue warnings about programming style issues (such as
binding a variable that is never used but not declared IGNORE).
Sometimes these messages obscure warnings about more serious problems,
and there should be some way to differentiate between the two. For
example, it should be possible to suppress the style warnings.
Also, neither CLtL nor issue RETURN-VALUES-UNSPECIFIED states what the
return value from COMPILE-FILE should be.
Proposal COMPILER-DIAGNOSTICS:USE-HANDLER:
(1) Introduce a new condition type, STYLE-WARNING, which is a subtype
of WARNING.
(2) Clarify that ERROR and WARNING conditions may be signalled within
COMPILE or COMPILE-FILE, including arbitrary errors which may
occur due to compile-time processing of (EVAL-WHEN (COMPILE) ...)
forms or macro expansion.
Considering only those conditions signalled -by- the compiler (as
opposed to -within- the compiler),
(a) Conditions of type ERROR may be signalled by the compiler in
situations where the compilation cannot proceed without
intervention.
Examples:
file open errors
syntax errors
(b) Conditions of type WARNING may be signalled by the compiler in
situations where the standard explicitly states that a warning must,
should, or may be signalled; and where the compiler can determine
that a situation with undefined consequences or that would cause
an error to be signalled would result at runtime.
Examples:
violation of type declarations
SETQ'ing or rebinding a constant defined with DEFCONSTANT
calls to built-in Lisp functions with wrong number of arguments
or malformed keyword argument lists
referencing a variable declared IGNORE
unrecognized declaration specifiers
(c) The compiler is permitted to signal diagnostics about matters of
programming style as conditions of type STYLE-WARNING. Although
STYLE-WARNINGs -may- be signalled in these situations, no
implementation is -required- to do so. However, if an
implementation does choose to signal a condition, that condition
will be of type STYLE-WARNING and will be signalled by a call to
the function WARN.
Examples:
redefinition of function with different argument list
calls to function with wrong number of arguments
unreferenced local variables not declared IGNORE
declaration specifiers described in CLtL but ignored by
the compiler
(3) State that both COMPILE and COMPILE-FILE are allowed to establish
a default condition handler. If such a condition handler is
established, however, it must first resignal the condition to give any
user-established handlers a chance to handle it. If all user error
handlers decline, the default handler may handle the condition in an
implementation-specific way; for example, it might turn errors into
warnings.
(4) Specify that COMPILE and COMPILE-FILE return two values. The
first value from COMPILE is the compiled function. The first value
from COMPILE-FILE is the truename of the output file, or NIL if the
file could not be created. The second value is one of the following
symbols, to indicate the success or failure of the compilation:
ERROR if a condition of type ERROR was detected by the compiler's
default error handler but not handled by a user error handler
when the condition was resignalled. So even if the error
were turned into a warning by the default handler, it would
still count as an error for this purpose.
WARNING if there was no error, and a warning was issued by the
compiler in a situation where the standard explicitly states
that a warning must, should, or may be signalled; or where
the compiler can determine that a situation with undefined
consequences or that would cause an error to be signalled
would result at runtime.
STYLE-WARNING if there was no error or warning, but there was a
style warning as defined in point 2c.
NIL if there were no errors, warnings, or style warnings.
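The ranking of the four status values in point 4 can be sketched as follows. This is a hypothetical illustration, not the proposed COMPILE-FILE: CALL-WITH-STATUS and DEMO-STYLE-WARNING are invented names, the thunk stands in for the compilation, and the sketch omits the resignalling step of point 3. It assumes warnings are signalled with WARN, so that a MUFFLE-WARNING restart is available.

```lisp
;; A style warning type for demonstration purposes only.
(define-condition demo-style-warning (style-warning) ())

(defun call-with-status (thunk)
  "Call THUNK and return its value plus the most severe status seen:
ERROR, WARNING, STYLE-WARNING, or NIL."
  (let ((status nil)
        (rank '(nil style-warning warning error)))
    (flet ((note (new)
             ;; Keep only the most severe status observed so far.
             (when (> (position new rank) (position status rank))
               (setq status new))))
      (let ((value
              (block run
                (handler-bind
                    ;; STYLE-WARNING is listed first so that it is not
                    ;; counted as a plain WARNING.
                    ((style-warning (lambda (c)
                                      (note 'style-warning)
                                      (muffle-warning c)))
                     (warning (lambda (c)
                                (note 'warning)
                                (muffle-warning c)))
                     (error (lambda (c)
                              (declare (ignore c))
                              (note 'error)
                              ;; Give up on the whole computation.
                              (return-from run nil))))
                  (funcall thunk)))))
        (values value status)))))

;; One ordinary warning outranks any number of style warnings.
(call-with-status (lambda ()
                    (warn 'demo-style-warning)
                    (warn "Something more serious.")
                    :done))
```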
Rationale:
Introducing the STYLE-WARNING condition allows handlers to distinguish
between potentially serious problems and mere kibitzing on the part of
the compiler.
Requiring any condition handlers established by the compiler to resignal
the condition before proceeding with any implementation-specific action
gives user error handlers a chance to override the compiler's default
behavior. For example, the user error handler could invoke a restart
such as ABORT or MUFFLE-WARNING.
Requiring the compiler to handle the ABORT restart reflects what
several implementations already do (although probably not using this
mechanism). The intent of the wording is to allow an implementation
to abort the entire compilation if it is not feasible to abort a
smaller part.
Requiring a second success-p value to be returned from COMPILE-FILE
gives the user some indication of whether there were serious problems
encountered in compiling the file.
Test Case/Example:
Here is an example of how COMPILE-FILE might set up its condition
handlers. It establishes an ABORT restart to abort the compilation
and a handler to take implementation-specific action on ERROR
conditions. Note that INTERNAL-COMPILE-FILE may set up additional
ABORT restarts.
(defvar *output-file-truename* nil)
(defun compile-file (input-file &key output-file)
(let ((*output-file-truename* nil)
(errors-detected nil))
(with-simple-restart (abort "Abort compilation.")
(handler-bind ((error #'(lambda (condition)
(setq errors-detected t)
(signal condition)
...)))
(internal-compile-file input-file output-file)))
(values *output-file-truename*
errors-detected)))
Current Practice:
No implementation behaves exactly as specified in this proposal.
In VaxLisp, COMPILE-FILE handles most compile-time errors without
invoking the debugger. (It gives up on that top-level form and moves on
to the next one.) Instead of signalling errors or warnings, it simply
prints them out as messages.
In Lucid Common Lisp, COMPILE-FILE invokes the debugger when it encounters
serious problems. COMPILE-FILE returns the pathname for the output file.
Symbolics Genera usually tries to keep compiling when it encounters errors;
so does Symbolics Cloe.
On the TI Explorer, the compiler tries to catch most errors and turn
them into warnings (except for errors on opening a file), but the user
can change special variable COMPILER:WARN-ON-ERRORS to NIL if he wants
to enter the debugger on an error signalled during reading, macro
expansion, or compile-time evaluation. The true name of the output
file is returned as the first value. A second value indicates whether
any errors or warnings were reported.
IIM Common Lisp's compiler handles errors using a resignalling mechanism
similar to what is described here.
Cost to implementors:
The cost to implementors is not trivial but not particularly high. This
proposal tries to allow implementations considerable freedom in what
kinds of conditions the compiler must detect and how they are handled,
while still allowing users some reasonably portable ways to deal with
compile-time errors.
Cost to users:
This is a compatible extension. This proposal may cause users to see
some small differences in the user interface to the compiler, but
implementations already vary quite widely in their approaches. Some
users will probably have to make some minor changes to their code.
Adding the STYLE-WARNING type may cause conflicts with programs
already using that name.
Benefits:
Users are given a way to detect and handle compilation errors, which
would simplify the implementation of portable code-maintenance
utilities. The behavior of the compiler in error situations is made
more uniform across implementations.
Discussion:
The issue of whether the compiler may print normal progress messages
is discussed in detail in a separate issue, COMPILER-VERBOSITY.
Explicit calls to ERROR don't really justify warnings to be signalled
at compile-time, but we assume implementors have some common sense
about when it's appropriate to do so.
Proposal CONDITION-RESTARTS:PERMIT-ASSOCIATION would make it illegal
for conditions to be resignalled. If that proposal is accepted, the
wording here would have to be changed to indicate that the compiler's
condition handler makes a copy of the condition and signals that.
Moon says:
I think [requiring the ABORT restart to be handled] is wrong. The only
documentation of the ABORT restart that I could find says
The purpose of the ABORT restart is generally to allow return to the
innermost ``command level.''
I agree with this, and I believe it means that it is wrong for any
function other than one that establishes a read-eval-print loop or
a command-level to establish an ABORT restart. It would be useful
to have some restart that aborts a portion of the compilation, but
it should be given some other name.
∂15-Jun-89 0824 X3J13-mailer issue CLOS-MACRO-COMPILATION, version 4
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 15 Jun 89 08:24:39 PDT
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA14829; Thu, 15 Jun 89 09:25:02 -0600
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA23848; Thu, 15 Jun 89 09:24:59 -0600
Date: Thu, 15 Jun 89 09:24:59 -0600
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8906151524.AA23848@defun.utah.edu>
To: x3j13@sail.stanford.edu
Subject: issue CLOS-MACRO-COMPILATION, version 4
Reply-To: cl-compiler@sail.stanford.edu
This version incorporates Gregor's amendment from the last meeting.
Forum: Compiler
Issue: CLOS-MACRO-COMPILATION
References: CLOS chapters 1 & 2 (88-002R)
CLOS chapter 3 (89-003)
Issue COMPILE-FILE-HANDLING-OF-TOP-LEVEL-FORMS
Issue DEFINING-MACROS-NON-TOP-LEVEL
Category: CLARIFICATION
Edit History: V1, 10 Mar 1989, Sandra Loosemore
V2, 13 Mar 1989, Sandra Loosemore
V3, 21 Mar 1989, Sandra Loosemore (fix error language)
V4, 11 Jun 1989, Sandra Loosemore (Gregor's amendment)
Status: Ready for release
Problem Description:
Do the CLOS defining macros (DEFCLASS, DEFMETHOD, DEFGENERIC, and
DEFINE-METHOD-COMBINATION) have compile-time side-effects similar
to those for DEFSTRUCT or DEFMACRO?
A part of the problem is that we do not currently have a full
understanding of all the issues involved. In particular, work on
defining the CLOS meta-object protocol is still in progress. The goal
of this proposal is to say enough about the behavior of these macros
in the standard so that users can use them portably in compiled code,
but to leave as much of the behavior as possible unspecified to avoid
placing undue restrictions on the meta-object protocol.
Proposal CLOS-MACRO-COMPILATION:MINIMAL:
State that top-level calls to the CLOS defining macros have the
following compile-time side-effects. Any other compile-time behavior
is explicitly left unspecified.
DEFCLASS:
* The class name may appear in subsequent type declarations.
* The class name can be used as a specializer in subsequent
DEFMETHOD forms.
DEFGENERIC:
* The generic function can be referenced in subsequent DEFMETHOD forms.
* The generic function is not callable at compile-time.
DEFMETHOD:
* The method is not callable at compile-time. If there is a generic
function with the same name at compile-time, compiling a DEFMETHOD
will not add the method to that generic function.
The error-signalling behavior described in the specification of
DEFMETHOD in CLOS chapter 2 (if the function isn't a generic function
or if the lambda-list is not congruent) happens only when the defining
form is executed, not at compile time.
The forms in EQL specializers are evaluated when the defining form
is executed. The implementation may try to evaluate them at
compile time, but must not depend on being able to do so.
DEFINE-METHOD-COMBINATION:
* The method combination can be used in subsequent DEFGENERIC forms.
The body of a DEFINE-METHOD-COMBINATION form is evaluated no earlier
than when the defining macro is executed and possibly as late as
generic function invocation time. The compiler may attempt to
evaluate these forms at compile time but must not depend on being able
to do so.
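As an illustration only, a file that stays within the guarantees of
proposal MINIMAL might look like the sketch below. All of the names
(CIRCLE, RADIUS, DIAMETER, AREA, *UNIT-TOKEN*) are hypothetical and
do not come from the CLOS documents.

```lisp
;; Hypothetical file contents; every name here is illustrative.
(defclass circle ()
  ((radius :initarg :radius :reader radius)))

;; The class name appears in a type declaration later in the same file.
(defun diameter (c)
  (declare (type circle c))
  (* 2 (radius c)))

(defgeneric area (shape))

;; The class name is used as a specializer.  The method is added to
;; AREA only when this form is executed (for example, at load time).
(defmethod area ((shape circle))
  (* pi (radius shape) (radius shape)))

;; The EQL specializer form is evaluated when the DEFMETHOD form is
;; executed.  *UNIT-TOKEN* need not have a value at compile time, so
;; nothing here depends on evaluating the form any earlier.
(defparameter *unit-token* (list 'unit))

(defmethod area ((shape (eql *unit-token*)))
  1)
```

Nothing in this sketch calls AREA or instantiates CIRCLE during
compilation of the file, since the proposal leaves that behavior
unspecified.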
Rationale:
The compile-time behavior of DEFCLASS is similar to DEFSTRUCT or
DEFTYPE.
DEFGENERIC and DEFMETHOD are similar to DEFUN, which doesn't add the
function definition to the compile-time environment. Since generic
functions may be freely redefined between compile and run time (just
like any other function), a method may end up "belonging" to a
different generic function at load time than at compile time. This
is why it is inappropriate to signal errors about congruency problems
(etc.) until the method is actually added to the generic function at
run time.
Current Practice:
The items listed under DEFCLASS in proposal MINIMAL are fairly standard
programming style.
Flavors does not support compile-time instantiation of classes. It
does not make method combinations available at compile-time either, but
Moon considers that to be a bad design choice.
Cost to implementors:
Unknown.
Cost to users:
Unknown, but probably fairly small.
Wrapping an (EVAL-WHEN (EVAL COMPILE LOAD) ...) around the appropriate
definitions will make sure they are fully defined at compile-time.
Alternatively, the definitions could be placed in a separate file,
which is loaded before compiling the file that depends on those
definitions.
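A minimal sketch of the EVAL-WHEN workaround follows. It assumes a
macro that needs an instance at macroexpansion time; the names UNIT,
UNIT-FACTOR, *METER*, METERS->CM, and DOOR-HEIGHT-CM are hypothetical.
The situations are written with the ANSI Common Lisp keywords
:COMPILE-TOPLEVEL, :LOAD-TOPLEVEL, and :EXECUTE, which correspond to
COMPILE, LOAD, and EVAL in the notation used above.

```lisp
;; Make the class, its reader, and one instance fully available at
;; compile time as well as at load time.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defclass unit ()
    ((factor :initarg :factor :reader unit-factor)))
  (defvar *meter* (make-instance 'unit :factor 100)))

;; This macro reads the instance while it is being expanded, so the
;; definitions above must already exist in the compiler's environment.
(defmacro meters->cm (form)
  `(* ,(unit-factor *meter*) ,form))

(defun door-height-cm ()
  (meters->cm 2))
```

Without the EVAL-WHEN, whether UNIT can be instantiated while the file
is being compiled is exactly the point this proposal leaves
unspecified.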
Benefits:
Programmers can rely on programs that use the CLOS defining macros
being able to compile correctly in all implementations, without having
to wrap explicit EVAL-WHENs around every macro call.
Discussion:
This writeup is based on discussions between Moon, Gray, and
Loosemore, who are mostly in agreement on the things presented in
proposal MINIMAL. We have purposely avoided saying anything about
whether meta-objects representing the classes, methods, etc. get
created at compile-time, or whether such meta-objects are fully or
partially defined. The basic questions addressed by this issue are
what kinds of things can be defined and then used during compilation
of the same file that defines them, and what restrictions might apply.
These proposals are not completely compatible with the meta-object
protocol document (89-003). Gregor Kiczales says:
No one believes that what is written in draft 10 of the MOP is valid.
Sandra Loosemore says:
Although I admit I don't understand all of the issues involved with
the meta-object protocol, I prefer proposal MINIMAL over
NOT-SO-MINIMAL. I don't think leaving the issue of whether or not
classes can be instantiated at compile-time unspecified places an
undue burden on users, and it does leave more freedom for the
meta-object protocol to decide what the right behavior really is.
Dick Gabriel notes:
The question I have about the process going on with respect to the
CLOS-MACRO-COMPILATION issue is whether the fine-grained behavior of
CLOS under various compilation/evaluation situations is being
over-specified.
At this stage of the game I worry that we might go a little too far in
one direction in specification when we are actually engaged in design
work. This isn't intended to be a criticism of any committees, but I
would feel a lot more comfortable with a conservative specification
that defined well-formed programs being loaded or compiled in fresh
Common Lisps with a pretty simple eval-when model that is easier to
specify and which makes it easy for all but the hairiest
compilation-environment-frobbing programs to be written.
∂15-Jun-89 0918 X3J13-mailer issue COMPILED-FUNCTION-REQUIREMENTS, version 6
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 15 Jun 89 09:18:31 PDT
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA15901; Thu, 15 Jun 89 10:18:50 -0600
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA23875; Thu, 15 Jun 89 10:18:47 -0600
Date: Thu, 15 Jun 89 10:18:47 -0600
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8906151618.AA23875@defun.utah.edu>
To: x3j13@sail.stanford.edu
Subject: issue COMPILED-FUNCTION-REQUIREMENTS, version 6
Reply-To: cl-compiler@sail.stanford.edu
The previous version of this issue was discussed briefly at the last
meeting.
Forum: Compiler
Issue: COMPILED-FUNCTION-REQUIREMENTS
References: CLtL p. 32, 76, 112, 143, 438-439
Issue FUNCTION-TYPE (passed)
Issue COMPILER-LET-CONFUSION (passed)
Issue EVAL-WHEN-NON-TOP-LEVEL (passed)
Issue LOAD-TIME-EVAL (passed)
Issue COMPILE-ENVIRONMENT-CONSISTENCY
Issue COMPILE-ARGUMENT-PROBLEMS (passed)
Category: CLARIFICATION, CHANGE
Edit History: V1, 3 Jan 1989 Sandra Loosemore
V2, 10 Jan 1989, Sandra Loosemore (additional proposal)
V3, 10 Feb 1989, Sandra Loosemore (new proposal)
V4, 11 Mar 1989, Sandra Loosemore (fix wording to agree
with other pending proposals)
V5, 23 Mar 1989, Sandra Loosemore (restore proposal FLUSH)
V6, 30 May 1989, Sandra Loosemore (fix proposal TIGHTEN to
apply only to COMPILE-FILE)
Status: Ready for release
Problem Description:
There is confusion about what functions might be or must be of type
COMPILED-FUNCTION, and what attributes must be true of
COMPILED-FUNCTIONs. Is the distinction between COMPILED-FUNCTIONs and
other functions only one of representation, or can user programs infer
anything about COMPILED-FUNCTIONs? Are implementations required to
distinguish between compiled and non-compiled functions?
CLtL defines a COMPILED-FUNCTION as "a compiled code object". (Issue
FUNCTION-TYPE says only that COMPILED-FUNCTION must be a subtype of
FUNCTION.) Although it is not explicitly stated, CLtL implies that
compiled code must conform to certain rules; in particular, it states
that all macros are expanded at compile time, and specifies different
behavior for the COMPILER-LET and the EVAL-WHEN special forms
depending on whether they are interpreted or compiled.
The description of COMPILE in CLtL says that "a compiled-function object
[is] produced". It is not clear to everyone whether this implies that
COMPILED-FUNCTION-P must be true of such functions. CLtL says nothing
about whether functions defined in files compiled with COMPILE-FILE and
subsequently loaded must be of type COMPILED-FUNCTION.
This proposal presents a simple model of the compilation process. A
minimal compiler could be implemented to perform a code walk to apply
the indicated transformations to the function source code. Of course,
most compilers will perform other transformations as well, such as
translating the Lisp source code into a representation that is more
compact or which can be executed more efficiently.
Proposal COMPILED-FUNCTION-REQUIREMENTS:TIGHTEN:
(1) Clarify that if a function is of type COMPILED-FUNCTION, the
following are guaranteed about the function:
- All macro calls appearing lexically within the function have
already been expanded and will not be expanded again when the
function is called. (See CLtL p. 143.) The process of
compilation effectively turns MACROLET and SYMBOL-MACROLET
constructs into PROGNs with all instances of the local macros
in the body fully expanded.
- The compiler must capture declarations to determine whether
variable bindings and references appearing lexically within
the function are to be treated as lexical or special.
- Lexically nested EVAL-WHENs have been processed as stated in
proposal EVAL-WHEN-NON-TOP-LEVEL; either the body is treated as
an implicit PROGN or as the constant NIL.
- If the function contains lexically nested LOAD-TIME-VALUE forms,
these have already been pre-evaluated and will not be evaluated
again when the function is called.
(2) Implementations are free to classify all functions as
COMPILED-FUNCTIONs, provided that all functions satisfy the criteria
listed in item (1). It is also permissible for functions that are
not COMPILED-FUNCTIONs to satisfy the above criteria.
(3) Clarify that when functions are defined in a file which is compiled
with COMPILE-FILE, and the compiled file is subsequently LOADed,
objects of type COMPILED-FUNCTION result.
(4) Clarify that COMPILE may or may not produce an object of type
COMPILED-FUNCTION; if the implementation cannot compile a function,
it may simply do nothing at all. Change the description of
COMPILE (from proposal COMPILE-ARGUMENT-PROBLEMS:CLARIFY) to state
that the behavior of COMPILE when passed a function that was
defined in a non-null lexical environment is unspecified (rather
than "is an error").
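The macro-expansion guarantee in item (1) can be sketched as follows.
The names SCALE and SCALED are hypothetical. The check is written so
that it makes no claim about functions that are not of type
COMPILED-FUNCTION, since item (4) allows COMPILE to do nothing.

```lisp
(defmacro scale (x) `(* 2 ,x))

(defun scaled (n)
  (scale n))

;; COMPILE may or may not yield a COMPILED-FUNCTION under item (4).
(compile 'scaled)

;; Redefine the macro after the function has been compiled.
(defmacro scale (x) `(* 3 ,x))

(defun scaled-still-uses-old-expansion-p ()
  ;; If SCALED is a COMPILED-FUNCTION, the call to SCALE inside it was
  ;; expanded before the redefinition and is not expanded again.
  (or (not (compiled-function-p #'scaled))
      (= (scaled 5) 10)))
```

An implementation may print a redefinition warning for the second
DEFMACRO; that does not affect the point being illustrated.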
Rationale:
This proposal allows users to count on COMPILE-FILE always producing
objects that are COMPILED-FUNCTION-P. It also allows for the
possibility that COMPILE may not actually do anything interesting
in some implementations.
Some specific properties are assigned to compiled functions. Users
would be able to rely on any function which is of type
COMPILED-FUNCTION having really been (at least partially) compiled.
It also states what many people believe to be the minimum functionality
required of a compiler.
Current Practice:
It appears that most implementations currently distinguish compiled
versus non-compiled functions on the basis of representation. It seems
unlikely that any implementation would have problems satisfying the
stated minimum requirements for compilation.
Lucid uses the same representation for both compiled and non-compiled
functions, except there is a bit in the header used to distinguish them.
A-Lisp uses the same representation for both compiled and interpreted
functions and currently labels them both as COMPILED-FUNCTION, but the
implementation of COMPILED-FUNCTION-P could be easily fixed to
distinguish "real" compiled functions.
On the TI Explorer, the COMPILE function can return an object of
either type COMPILED-FUNCTION or LEXICAL-CLOSURE, where the latter
consists of two components -- an environment and a COMPILED-FUNCTION.
There is confusion about whether microcoded functions should be
considered compiled or not.
In Utah Common Lisp, COMPILED-FUNCTION-P currently returns true of all
function objects, but there is an internal tag field in the object
which allows real compiled functions to be distinguished from
interpreted functions.
Cost to implementors:
Unknown, but probably not too great. Many implementations will
probably have to make some minor changes to representation of
functions and/or to the definition of COMPILED-FUNCTION-P, but
probably most of those changes are necessary to support the
FUNCTION-TYPE proposal anyway.
Cost to users:
Probably minimal. Since the COMPILED-FUNCTION type specifier is
currently ill-defined, it is hard to imagine that existing programs
can portably rely on any interpretation of what it means that is
inconsistent with what is presented here.
Benefits:
The specification of what the compiler must do is made more explicit.
Discussion:
This writeup originally contained two other proposals, FLUSH and
TIGHTEN-COMPILE. A straw vote at the March 1989 meeting indicated
that proposal TIGHTEN had the most support.
The FIXNUM and BIGNUM types were also defined in CLtL solely on the
basis of distinguished representations, and this definition has
proved inadequate for just about all portable usages of these type
specifiers. Defining COMPILED-FUNCTION solely on the basis of
distinguished representation seems like a bad idea.
David Gray notes:
We make good use of the type COMPILED-FUNCTION in our implementation,
but all of the accessor functions for objects of that type are
non-standard, which makes me wonder if it might be best to just remove
this type from the standard along with BIGNUM.
One use of the COMPILED-FUNCTION type is in declarations. A-Lisp and
Lucid, for example, can compile FUNCALL more efficiently if it can be
determined that the function is of type COMPILED-FUNCTION. However,
in order for such declarations to be really useful, there should be a
way to construct an object which is guaranteed to be of type
COMPILED-FUNCTION.
Moon says:
I much prefer the option FLUSH...
This type has no portable meaning and never should have existed.
Pierson says:
What I (and believe Kent) want is a guarantee that [COMPILE] won't
signal an error; if nothing else works COMPILE will simply apply
#'IDENTITY to the symbol's function. Specifically, it should be
legal and safe to attempt to speed up my current program(s) by
doing:
(DO-SYMBOLS (SYM <my-package>)
(WHEN (FBOUNDP SYM) (COMPILE SYM)))
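A more defensive variant of the loop above might look like the sketch
below. This is only one possible reading of the request; the function
name COMPILE-OWN-FUNCTIONS is hypothetical. It skips macros, special
operators, inherited symbols, and functions that are already of type
COMPILED-FUNCTION. Whether COMPILE accepts a function defined in a
non-null lexical environment remains implementation-dependent.

```lisp
(defun compile-own-functions (package-designator)
  "Hand to COMPILE each ordinary, not-yet-compiled function named by a
symbol whose home package is PACKAGE-DESIGNATOR.  Returns the list of
symbols that were handed to COMPILE."
  (let ((package (find-package package-designator))
        (attempted '()))
    (do-symbols (sym package)
      (when (and (eq (symbol-package sym) package) ; skip inherited symbols
                 (fboundp sym)
                 (not (special-operator-p sym))
                 (not (macro-function sym))
                 (not (compiled-function-p (fdefinition sym))))
        (compile sym)
        (pushnew sym attempted)))
    attempted))
```

In an implementation that classifies every function as a
COMPILED-FUNCTION, this returns NIL and changes nothing.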
∂15-Jun-89 0913 X3J13-mailer issue COMPILE-FILE-SYMBOL-HANDLING, version 4
Received: from cs.utah.edu by SAIL.Stanford.EDU with TCP; 15 Jun 89 09:13:39 PDT
Received: from defun.utah.edu by cs.utah.edu (5.61/utah-2.1-cs)
id AA15840; Thu, 15 Jun 89 10:13:59 -0600
Received: by defun.utah.edu (5.61/utah-2.0-leaf)
id AA23870; Thu, 15 Jun 89 10:13:56 -0600
Date: Thu, 15 Jun 89 10:13:56 -0600
From: sandra%defun@cs.utah.edu (Sandra J Loosemore)
Message-Id: <8906151613.AA23870@defun.utah.edu>
To: x3j13@sail.stanford.edu
Subject: issue COMPILE-FILE-SYMBOL-HANDLING, version 4
Reply-To: cl-compiler@sail.stanford.edu
This is a new proposal based on discussion of version 2 at the last
meeting.
Forum: Compiler
Issue: COMPILE-FILE-SYMBOL-HANDLING
References: CLtL p. 182
Issue IN-PACKAGE-FUNCTIONALITY (passed)
Issue CONSTANT-COMPILABLE-TYPES (passed)
Issue DEFPACKAGE (passed)
Category: CHANGE/CLARIFICATION
Edit History: V1, 01 Feb 1989, Sandra Loosemore
V2, 12 Mar 1989, Sandra Loosemore (discussion, error terms)
V3, 18 Apr 1989, Sandra Loosemore (new proposal)
V4, 15 Jun 1989, Sandra Loosemore (minor wording changes)
Status: Ready for release
Problem Description:
It is not clear how COMPILE-FILE is supposed to specify to LOAD how
symbols in the compiled file should be interned. In particular, what
happens if the value of *PACKAGE* is different at load-time than it
was at compile-time, or if any of the packages referenced in the file
are defined differently?
There are two models currently being used to implement this behavior:
(1) Symbols appearing in the output file produced by COMPILE-FILE
are qualified with package prefixes in the same way that PRINT
would qualify them. Symbols that are accessible in *PACKAGE*
at compile-time are looked up in *PACKAGE* at load-time. (This
is the "current-package" model.)
(2) Symbols appearing in the output file produced by COMPILE-FILE
are always qualified with the name of their home package,
regardless of the value of *PACKAGE*. (This is the
"home-package" model.)
Proposal COMPILE-FILE-SYMBOL-HANDLING:NEW-REQUIRE-CONSISTENCY:
In order to guarantee that compiled files can be loaded correctly,
users must ensure that the packages referenced in the file are defined
consistently at compile and load time. Conforming Common Lisp programs
must satisfy the following requirements:
(1) The value of *PACKAGE* when a top-level form in the file is processed
by COMPILE-FILE must be the same as the value of *PACKAGE* when the
code corresponding to that top-level form in the compiled file is
executed by the loader. In particular:
(a) Any top-level form in a file which alters the value of *PACKAGE*
must change it to a package of the same name at both compile and
load time.
(b) If the first top-level form in the file is not a call to
IN-PACKAGE, then the value of *PACKAGE* at the time LOAD is
called must be a package with the same name as the package that
was the value of *PACKAGE* at the time COMPILE-FILE was called.
(2) For all symbols appearing lexically within a top-level form that
were accessible in the package that was the value of *PACKAGE*
during processing of that top-level form at compile time, but
whose home package was another package, at load time there must
be a symbol with the same name that is accessible in both the
load-time *PACKAGE* and in the package with the same name as the
compile-time home package.
(3) For all symbols in the compiled file that were external symbols in
their home package at compile time, there must be a symbol with the
same name that is an external symbol in the package with the same name
at load time.
If any of these conditions do not hold, the package in which LOAD looks
for the affected symbols is unspecified. Implementations are permitted
to signal an error or otherwise define this behavior.
If all of these conditions hold, then when a compiled file is
loaded, the interned symbols it references are found as if by calling
INTERN with two arguments, the name of the symbol and the package
with the same name as the compile-time symbol's home package. (Note
that for a symbol that was accessible in *PACKAGE* at compile time,
this must give the same result as passing the load-time value of
*PACKAGE* to INTERN, due to restriction 2.) If no such package exists,
an error is signalled.
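The lookup rule in the last paragraph can be sketched as a small
helper. This is an illustration of the "as if by calling INTERN"
wording, not a description of any real loader; the name
RESOLVE-DUMPED-SYMBOL and its argument names are hypothetical.

```lisp
(defun resolve-dumped-symbol (name home-package-name)
  "Find the symbol called NAME as if by INTERN in the package whose
name is HOME-PACKAGE-NAME, signalling an error if no such package
exists at load time."
  (let ((package (find-package home-package-name)))
    (unless package
      (error "No package named ~S exists at load time."
             home-package-name))
    ;; INTERN returns two values; only the symbol is wanted here.
    (values (intern name package))))
```

For example, (resolve-dumped-symbol "CAR" "COMMON-LISP") returns the
symbol CAR, while a package name that does not exist at load time
leads to an error.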
Rationale:
This proposal is merely an explicit statement of the status quo,
namely that users cannot depend on any particular behavior if the
package environment at load time is inconsistent with what existed
at compile time.
This proposal supports both the current-package and home-package
models as implementation techniques.
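The difference between the two models can be sketched by the text each
one would record for a symbol. These helpers are hypothetical and only
imitate the qualification rules; real compiled-file formats are
implementation-specific. The sketch assumes interned symbols whose
names need no escape characters.

```lisp
(defun dump-name/home-package (symbol)
  ;; Home-package model: always qualify with the home package name.
  (format nil "~A::~A"
          (package-name (symbol-package symbol))
          (symbol-name symbol)))

(defun dump-name/current-package (symbol current-package)
  ;; Current-package model: qualify the way PRINT would, relative to
  ;; the package that is current when the symbol is dumped.
  (with-standard-io-syntax
    (let ((*package* current-package))
      (prin1-to-string symbol))))
```

A symbol that is accessible in the current package is recorded without
a prefix under the second helper, which is why the load-time value of
*PACKAGE* matters for that model.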
Current Practice:
PSL/PCLS implements the home-package model, as does A-Lisp. Utah
Common Lisp implements the current-package model, but the chief
compiler hacker says he thinks that the home-package model
actually makes more sense, and agrees that any program that behaves
differently under the two models is broken.
The TI Explorer currently implements the home-package model, after
trying it both ways.
KCL also implements the home-package model.
Lucid Lisp appears to implement the current-package model.
Symbolics Genera implements the current-package model. Symbolics
Cloe probably does also.
Coral also implements the current-package model.
Cost to implementors:
Proposal NEW-REQUIRE-CONSISTENCY is intended to be compatible with either
of the two models, but it may not be entirely compatible with the
details of current implementations.
In particular, some implementations that use the current-package
model appear to restrict IN-PACKAGE to being the first top-level
form in the file and dump all symbols referenced in the file after
the entire file has been processed (so that the value of *PACKAGE*
used to determine whether to qualify symbols in the output file is
the same for all symbols in the file). Under this proposal, these
implementations would have to note when the value of *PACKAGE*
changes during processing of a top-level form.
Cost to users:
Any user program that would break under proposal NEW-REQUIRE-CONSISTENCY
is probably already nonportable, since this proposal is intended to
leave the behavior unspecified where it would differ under the
two implementation models.
For a discussion of how the two models treat nonportable or erroneous
programs, see the "Analysis" section below.
Benefits:
COMPILE-FILE's treatment of symbols is made explicit in the standard.
Analysis:
The two implementation models differ in the following situations.
Proposal NEW-REQUIRE-CONSISTENCY, in effect, says that valid programs do
not cause any of these situations to occur, and the behavior in such
cases is unspecified (allowing both models to be used as valid
implementation techniques).
(1) The situation where the file does not contain an IN-PACKAGE
and where the compile-time value of *PACKAGE* is a package with a
different name than the load-time value of *PACKAGE*.
The current-package model would intern the names of symbols that
were accessible in *PACKAGE* at compile time in *PACKAGE* at load time.
The home-package model would intern the names of symbols that
were accessible in *PACKAGE* at compile time in the package with
the same name as their compile-time home package.
In general, programs must be compiled in the "right" package, so
that the compiler can find and apply the correct macro expansions,
type definitions, and so on; see issue COMPILE-ENVIRONMENT-CONSISTENCY.
As a result of macroexpansion or other transformations applied by
the compiler, the compiled file may contain symbol references that
were not present in the source file. The current-package model may
cause problems because these references may be resolved to be
symbols other than the ones that were intended. The home-package
model is more likely to find the correct symbols at load time.
(2) The situation where there is a symbol accessible in the
compile-time value of *PACKAGE* but with another home package, and
where at load time there is not a symbol with the same name that
is accessible in both packages. This situation might occur, for
example, if at compile time there is a symbol that is external in
its home package and that package is used by *PACKAGE*, but where
there is no such external symbol in that package at load time, or
the load-time *PACKAGE* does not use the other package.
The current-package model would find or create a symbol accessible
in *PACKAGE*.
The home-package model would find or create a symbol accessible in
a package with the same name as the symbol's compile-time home
package.
Some people feel that the behavior of the current-package model is
more intuitive in this situation, and that it is more forgiving of
differences between the compile-time and load-time package
structures. Others feel that the behavior of the home-package
model is more intuitive, and that if there have been significant
changes to the package structures, it is probably an indication
that the file needs to be recompiled anyway, since the compiler
might have picked up macro definitions and the like from the
wrong package.
(3) The situation where a symbol is external in its home package
and where there is no such external symbol in that package at load
time.
The current-package model would quietly find or create the symbol
in *PACKAGE* if the symbol were accessible in *PACKAGE* at compile
time. Otherwise, it would signal an error.
The home-package model would always just quietly find or create the
symbol as internal in its home package.
Not complaining when a symbol that is supposed to be external
isn't can be seen as a violation of modularity. However, it seems
like this argument should apply equally to symbols whose home
package is *PACKAGE* as to symbols whose home package is somewhere
else.
Discussion:
There has been some further and lengthy discussion on the question of
whether this proposal overspecifies the behavior of COMPILE-FILE and
LOAD. At least one person would like the standard to say nothing on
this issue beyond a statement of the goal that loading a compiled file
should exhibit the same behavior as loading its source file. We have
also considered another approach to the problem that would place more
stringent requirements on conforming programs and fewer requirements
on implementations. Neither of these alternatives seems to have much
support, though.
Forum: Compiler
Issue: CONSTANT-FUNCTION-COMPILATION
References: Issue CONSTANT-COMPILABLE-TYPES
Category: CLARIFICATION
Edit History: V1, 22 Mar 1989, Sandra Loosemore (split from issue
CONSTANT-COMPILABLE-TYPES)
Status: **DRAFT**
Problem Description:
Can objects of type FUNCTION (or some subset of FUNCTIONs) appear as
quoted or self-evaluating constants in compiled code?
There are two questions that must be answered:
- How does one test whether a particular function is a member of the
subset of functions that are dumpable?
- For those functions that are dumpable, how do COMPILE-FILE and LOAD
arrange for an "equivalent" copy of the function in the source code to
be created in the compiled code?
This writeup uses terminology from issue CONSTANT-COMPILABLE-TYPES:
"source code", "compiled code", and "similar as constants".
Proposal CONSTANT-FUNCTION-COMPILATION:NO:
Objects of type FUNCTION are not supported in compiled constants.
Rationale:
Nobody has been able to come up with a well-defined specification of
how the compiler and loader would be required to reconstruct
function constants that would work for all functions.
Nobody has been able to come up with a well-defined specification of
some subset of functions that could be dumped.
Current Practice:
Coral can dump compiled functions, but not foreign functions.
The TI Explorer cannot dump closures (either compiled or evaluated),
but can dump non-closure compiled functions.
Symbolics Genera can't dump closures either.
Kyoto Common Lisp can't dump any functions.
Cost to implementors:
None. Implementations that can dump (some subset of) functions may
continue to do so, since issue CONSTANT-COMPILABLE-TYPES permits
implementations to extend the notion of "similar as constants".
Cost to users:
None. Programs that depend on being able to dump functions are
already nonportable, since not all implementations can dump all
functions and there is no portable way to construct or test for
functions that are dumpable in those implementations.
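One portable way to avoid function constants is to put a form in the
code and let it be evaluated at load time. The sketch below assumes
the LOAD-TIME-VALUE construct from issue LOAD-TIME-EVAL; the names
MAKE-ADDER, ADD-TEN, and ADD-TEN-TO are hypothetical.

```lisp
(defun make-adder (n)
  (lambda (x) (+ x n)))

;; Nonportable under proposal NO: the expansion would contain a
;; function object as a literal constant, which COMPILE-FILE need
;; not be able to dump.
;;   (defmacro add-ten (x) `(funcall ',(make-adder 10) ,x))

;; Portable alternative: the expansion contains only a form, and the
;; closure is created once, when the compiled code is loaded.
(defmacro add-ten (x)
  `(funcall (load-time-value (make-adder 10)) ,x))

(defun add-ten-to (x)
  (add-ten x))
```

MAKE-ADDER must be defined before the LOAD-TIME-VALUE form is
evaluated, which holds here because its definition comes earlier in
the same file.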
Benefits:
Users will know what to (or what not to) expect when using functions
in compiled constants.
Discussion:
This issue was split from issue CONSTANT-COMPILABLE-TYPES because it
appeared to be controversial enough to merit separate discussion.
Cris Perdue originally suggested:
Only function constants that are not compiled-functions and do not
close over any (lexical) variables are supported in compiled
constants.
Two such functions are similar as constants if their
SOURCE-LAMBDA-EXPRESSIONs are similar as constants.
Dick Gabriel responded:
I guess I pretty strongly object to leaving functions out of the list
of constants that can appear in compiled code. The part that's
disturbing is that such non-Lispy things like arrays, hashtables, and
pathnames get better treatment than functions, the most Lispy part of
Common Lisp. I wonder how many implementations will be forced to come
within an inch of the required functionality to implement a first-rate
CLOS?
The specification of the subset of functions that are acceptable as
compiled constants cannot be tested for within Common Lisp itself.
I suggest we ask implementors (Lucid included) to bite the bullet and
handle this case correctly. Won't our grandchildren appreciate us
treating Common Lisp like Lisp and not like PASCAL?
If we were to specify that all functions could appear as constants, we
would also need to clarify whether the closed-over variable bindings
become immutable, and also deal with whether bindings that are closed
over by more than one function retain their uniqueness. Also, the cost
to implementors to add support for dumping non-interpreted functions
may be quite high.
Issue: PROCLAIM-ETC-IN-COMPILE-FILE
References: CLtL p. 182 [package functions],
p. 156 [PROCLAIM], p. 439 [COMPILE-FILE];
Issue COMPILE-FILE-HANDLING-OF-TOP-LEVEL-FORMS
Issue IN-PACKAGE-FUNCTIONALITY
Issue EVAL-WHEN-NON-TOP-LEVEL
Issue DEFINING-MACROS-NON-TOP-LEVEL
Category: CLARIFICATION, CHANGE, ADDITION
Edit History: 15 Sep 88, V1 by David Gray
23 Sep 88, V2 by Sandra Loosemore (summarize discussion)
11 Mar 89, V3 by Sandra Loosemore (rewrite)
13 Mar 89, V4 by Sandra Loosemore (discussion)
Status: **DRAFT**
Problem Description:
Should the compiler treat top-level calls to PROCLAIM specially?
Page 182 of CLtL says that COMPILE-FILE needs to treat top-level calls
to the following package functions as though they were wrapped in an
(EVAL-WHEN (COMPILE LOAD EVAL) ...):
EXPORT IMPORT IN-PACKAGE MAKE-PACKAGE SHADOW
SHADOWING-IMPORT UNEXPORT UNUSE-PACKAGE USE-PACKAGE
CLtL is silent on whether top-level calls to PROCLAIM should also be
evaluated at compile-time, which presumably means they shouldn't be.
However, some implementations do evaluate PROCLAIM at compile-time.
In the model of how COMPILE-FILE works that is presented in issues
EVAL-WHEN-NON-TOP-LEVEL and DEFINING-MACROS-NON-TOP-LEVEL, the special
form EVAL-WHEN is the only thing that can cause compile-time evaluation
to occur. The compile-time side-effects of macros such as DEFMACRO
and DEFPACKAGE are explained by having them include EVAL-WHEN in their
expansions. Any functions that are treated specially, however, must
be included as special cases in the compiler.
Proposal IN-PACKAGE-FUNCTIONALITY:NEW-MACRO would remove the
requirement that the package functions be treated specially. Do we
wish to make an exception to the model for PROCLAIM?
Proposal PROCLAIM-ETC-IN-COMPILE-FILE:YES:
Require COMPILE-FILE to treat top-level calls to PROCLAIM as if they
were wrapped in an (EVAL-WHEN (COMPILE LOAD EVAL) ...).
Rationale:
Proclamations affect compilation semantics and should be made
available to the compiler.
Proposal PROCLAIM-ETC-IN-COMPILE-FILE:NO:
Clarify that calls to PROCLAIM should be treated the same as any
other function call. Users should wrap an explicit EVAL-WHEN around
top-level calls to PROCLAIM if they want them to affect compilation.
Rationale:
This makes the semantics of COMPILE-FILE more uniform and easier
to understand. In particular, if we remove the magic compile-time
behavior of the package functions, it seems silly to add another
exception for PROCLAIM.
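Under proposal NO, a user who wants a proclamation to affect
compilation of the rest of the file would write something like the
sketch below. The variable and function names are hypothetical. The
situations are written with the ANSI Common Lisp keywords, which
correspond to COMPILE, LOAD, and EVAL in the notation used above.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (proclaim '(special *trace-level*)))

(defun current-trace-level ()
  *trace-level*)

(defun call-with-trace-level (level thunk)
  ;; Because the compiler has seen the proclamation, this LET makes a
  ;; special (dynamic) binding that THUNK can observe.
  (let ((*trace-level* level))
    (funcall thunk)))
```

Without the EVAL-WHEN, a compiler that follows proposal NO would treat
the LET as an ordinary lexical binding when compiling the file.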
Proposal PROCLAIM-ETC-IN-COMPILE-FILE:NEW-MACRO:
Add a new macro:
DEFPROCLAIM &rest decl-specs [Macro]
This macro PROCLAIMs the given <decl-specs>, which are not
evaluated. If a call to this macro appears at top-level in a file
being processed by the file compiler, the proclamations are also
made at compile-time. As with other defining macros, it is
unspecified whether or not the compile-time side-effects of a
DEFPROCLAIM persist after the file has been compiled.
Clarify that calls to PROCLAIM should be treated the same as any
other function call. Users should wrap an explicit EVAL-WHEN around
top-level calls to PROCLAIM if they want them to affect compilation,
or use the macro DEFPROCLAIM.
Rationale:
The macro makes the proclamations available to the compiler in a
way that does not require any special exceptions to be made in
the model of how COMPILE-FILE works.
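One possible definition of the proposed macro is sketched below. It is
only an assumption about how DEFPROCLAIM could be written in terms of
EVAL-WHEN and PROCLAIM; the proposal itself does not give a
definition. The variable and function names used with it are
hypothetical.

```lisp
(defmacro defproclaim (&rest decl-specs)
  ;; The decl-specs are not evaluated; each one is quoted and passed
  ;; to PROCLAIM at compile time, load time, and in the evaluator.
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     ,@(mapcar (lambda (spec) `(proclaim ',spec)) decl-specs)))

(defproclaim (special *indent-depth*))

(defun current-indent-depth ()
  *indent-depth*)

(defun call-with-indent-depth (depth thunk)
  ;; A special binding, because of the proclamation above.
  (let ((*indent-depth* depth))
    (funcall thunk)))
```

Because the expansion is an EVAL-WHEN form, the compile-time effect
follows from the general model and needs no special case for PROCLAIM
in the compiler.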
Current Practice:
The TI Explorer apparently implements proposal YES, except that
(EVAL-WHEN (LOAD) (PROCLAIM '(OPTIMIZE ...))) doesn't do anything.
The Symbolics compiler has special top-level handling for PROCLAIM,
although the details are not clear.
Lisps developed at Utah (UCL, A-Lisp, PSL/PCLS) do not give PROCLAIM
any special compile-time handling.
Lucid does not evaluate calls to PROCLAIM at compile-time.
The IIM compiler has special top-level handling for PROCLAIM when
the argument is a constant. The information is recorded in the remote
environment.
Cost to implementors:
Since implementations are already required to have a mechanism for
compile-time handling of the package functions, it would probably
only require minor adjustments to add handling for PROCLAIM.
Cost to users:
For proposal YES, users would have no way to suppress compile-time
evaluation of a top-level call to PROCLAIM. Wrapping it in an
(EVAL-WHEN (EVAL LOAD)...) wouldn't work under the model of how
EVAL-WHEN works in proposal EVAL-WHEN-NON-TOP-LEVEL:GENERALIZE-EVAL.
Under any of these proposals, some users would probably have to
make minor changes to their code.
Benefits:
Users will know what to expect when they use PROCLAIM.
Costs of Non-Adoption:
Users will not know what to expect when they use PROCLAIM.
Aesthetics:
At least two people consider requiring magic behavior for certain
top-level function calls to be "semantically bletcherous". Removing
all special cases for functions that are implicitly evaluated at
compile-time would simplify the model of how COMPILE-FILE works.
Programs look cleaner if EVAL-WHEN is only needed for unusual cases
instead of being required for the normal cases.
Discussion:
The first version of this writeup also included REQUIRE with PROCLAIM,
but we have now voted to remove REQUIRE from the language entirely.
It also specified that OPTIMIZE proclamations should only have a local
effect within the file being compiled. This was removed for
consistency with other compile-time side-effects (such as those from
DEFMACRO), where their persistence is explicitly left unspecified.
Loosemore favors proposal NO, but wouldn't oppose proposal NEW-MACRO.
Kim Barrett says:
Proposal YES violates the general approach we've been taking of trying
to limit side-effects on the local environment during compilation.
Proposal NO makes PROCLAIM virtually worthless.
Proposal NEW-MACRO -- While this matches up with other stuff we've
been doing, I'm concerned about two things. First, I really dislike
the name DEFPROCLAIM. This thing isn't defining anything! It sounds
like something that modifies the behavior of PROCLAIM, not something
that actually makes a proclamation. Second, I'm concerned about the
cost to users. I think the statement that
"Under any of these proposals, some users would probably have to make
minor changes to their code."
is rather misleading for this case. There are a lot of PROCLAIMs out
there.
Loosemore replies:
....but all of those uses of PROCLAIM are already nonportable. No
matter what we do here, somebody is going to get burned.
Suggestions for better names for the macro are welcome.